ABSTRACT OF THE DISSERTATION


Bounds  for  Iterates,  Inverses  and  Spectral  Variation 
of  Non-Normal  Matrices 
by

Edward  Arthur  Sallin 

University  of  California,  Los  Angeles,  1963 
Professor Peter K. Henrici, Chairman

For an arbitrary multiplicative matrix norm $\nu$ and arbitrary non-singular matrix N, we define the condition number $C_\nu(N)$ of N with respect to $\nu$ as $\nu(N)\,\nu(N^{-1})$. A square matrix is said to be quasi-diagonal if it is a symmetrically partitioned triangular matrix which is diagonal when considered as a partitioned matrix.

The following problems of computational linear algebra are considered in this paper:

(I) Given an arbitrary square matrix A, to explicitly construct, and determine a bound for a condition number of, a matrix N such that $Q = N^{-1}AN$ is quasi-diagonal.

(II) To estimate the norms of $A^n$, $n = 1, 2, \ldots$, in terms of the eigenvalues of A and $C_\nu(N)$.

(III) To estimate the error $x - A^{-1}b$ of an approximate solution x of the equation Ax = b in terms of the residual r = Ax - b, the eigenvalues of A and $C_\nu(N)$. To estimate the error $X - A^{-1}$ of an approximate inverse X of A in terms of the residual AX - I, the eigenvalues of A and $C_\nu(N)$.

(IV) To estimate the distance of the spectrum of an arbitrary matrix B from the spectrum of A in terms of a norm of B - A, the eigenvalues of A and $C_\nu(N)$.
Solutions to problems (II), (III) and (IV) are classical if A is normal, i.e., $AA^* = A^*A$. Solutions have been constructed for non-normal A, but with less satisfactory results. A partitioned Schur form of a given matrix A will be called an ordered Schur form if (i) the eigenvalues are lexicographically ordered by blocks on the diagonal, and (ii) equal eigenvalues belong to the same block. Let A be an ordered Schur form of B and let N be chosen such that $Q = N^{-1}AN$ is quasi-diagonal with $Q = \mathrm{diag}(Q_{11}, Q_{22}, \ldots, Q_{kk})$, where $Q_{ii} = A_{ii}$ is of order $n_i$. Writing $Q = D + L$, where D is the diagonal part of Q, and setting

11  l  i  i  11 


-  »  X_  where  o'  is  the  spectral  norm  and  X.  is 

1  Qlt  A 


the spectral radius of A, we can conclude:


THEOREM.  If  X.  >  0 

—  —  D 


rf(Br)  5  »in[ctf(H>  max  +  (j)^  1<t 


If  xB  *0: 


cr(Br)  <  winJc^fN)  max  f^j 


tf(Br) 


r  >  M 


A1"-", 

,  ,  A.  i 
n  -  1 )  i 


•<■))■ 


1,2, „ . . ,  M  —  1 


where $M = \max_i n_i$ and where the minimum is taken over all ordered Schur forms.

If $f_n(x)$ is defined for all $x \ge 0$ by $f_n(x) = x + x^2 + \cdots + x^n$, we have:

THEOREM. If B is non-singular and non-normal and if

5i  l  *i 


then 
d(B  *)  <  min["c  (N)  max
-  I  o'  i 

where the minimum is taken over all ordered Schur forms.

Let the function $g_n = g_n(y)$ be defined for all $y \ge 0$ as the (unique) non-negative solution of the equation $g + g^2 + \cdots + g^n = y$. Then if M is an arbitrary matrix with eigenvalues $\lambda_i$ and B has eigenvalues $\mu_i$, the quantity

$$s = s_M(B) = \max_i \min_j |\mu_i - \lambda_j|$$

is called the spectral variation of B with respect to M.

THEOREM. For non-normal M with $M - B \ne 0$ we have for any norm $\nu$ dominating $\sigma$:

sm(B)  <  min  ([max  — - - 

n  —  [_  L  <  n . 

g  l(yt) 

where 

lK\) 

yt "  Cy(N)  ^(u*bu  -~sy 

and the minimum is taken with respect to all U occurring in an ordered Schur form of M.


C^(N)  >>(U*bu  -  M)| 


A71}] 


INTRODUCTION 


Unless otherwise stated, all matrices in this paper shall be assumed to be rectangular $m \times n$ with complex elements and shall be denoted by $A = (a_{ij})$; a vector shall mean a column vector with n complex elements.

A matrix norm is a real-valued function $\nu$ defined on the space of square matrices and satisfying the following relations for arbitrary matrices A and B and arbitrary complex scalars c:


(a) $\nu(A) \ge 0$; $\nu(A) = 0$ if and only if A = 0

(b) $\nu(cA) = |c|\,\nu(A)$

(c) $\nu(A + B) \le \nu(A) + \nu(B)$.

If in addition,

(d) $\nu(AB) \le \nu(A)\,\nu(B)$,

the norm is called multiplicative. We shall be concerned primarily with such norms.

A vector norm is a real-valued function defined on the space of vectors and satisfying relations analogous to (a), (b) and (c) above.


By the spectrum of a matrix A we mean the totality of its eigenvalues, considered as a point set in the complex plane. The largest of the moduli of the eigenvalues of A is called the spectral radius of A and is denoted by $\lambda_A$. For any invertible matrix S and multiplicative norm $\nu$ the condition number, $C_\nu(S)$, of S with respect to the norm $\nu$ is defined by

$$C_\nu(S) = \nu(S)\,\nu(S^{-1})$$

(Bauer [2]).
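As an illustration of this definition, the following minimal sketch evaluates $C_\nu(S)$ for a 2 x 2 matrix, assuming $\nu$ is the maximum absolute row sum (one multiplicative norm among many). The function names are ours and are not part of the text.

```python
from fractions import Fraction

def row_sum_norm(a):
    """Maximum absolute row sum; a multiplicative matrix norm."""
    return max(sum(abs(x) for x in row) for row in a)

def inverse_2x2(s):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = s
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def condition_number(s, norm=row_sum_norm):
    """C_nu(S) = nu(S) * nu(S^{-1}) for a 2 x 2 matrix S."""
    return norm(s) * norm(inverse_2x2(s))

# A unit lower triangular matrix with a large off-diagonal entry.
S = [[Fraction(1), Fraction(0)], [Fraction(10), Fraction(1)]]
I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
print(condition_number(S), condition_number(I2))
```

For this norm the identity has condition number 1, and the condition number of S grows with the size of its off-diagonal entry.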


A matrix is said to be partitioned (a partitioned or block matrix) if it has been divided into smaller arrays by horizontal and vertical lines and each of the resulting submatrices has been represented by a single element. If an $m \times n$ matrix A is partitioned it shall be denoted by $A = (A_{ij})$, the i,j element of A being the submatrix $A_{ij}$ of A, of order $m_i \times n_j$, where $\sum_i m_i = m$, $\sum_j n_j = n$. A partitioning of a matrix A by an equal number of horizontal and vertical lines in such a manner that each of the resulting diagonal entries $A_{ii}$ is a square matrix is called a symmetric partition.

A symmetrically partitioned triangular matrix A will be called sp-triangular. Thus $A_{ij} = 0$ for i < j, and the $A_{ii}$ are (lower) triangular matrices.

A partitioned matrix A is said to be quasi-diagonal if it is sp-triangular and if $A_{ij} = 0$ for $i \ne j$.

The following problems of computational linear algebra will be considered in this paper:

(i) Given an arbitrary sp-triangular matrix A, to construct a matrix N such that $N^{-1}AN$ is quasi-diagonal with predetermined diagonal entries and to determine a bound for a condition number of N.

(ii) To estimate the norms of the matrices $A^n$, $n = 1, 2, \ldots$, in terms of the eigenvalues of A and a condition number of N, above.

(iii) To estimate the error $x - A^{-1}b$ of an approximate solution x of the equation Ax = b in terms of the residual r = Ax - b, the eigenvalues of A and a condition number of N.

(iv) To estimate the distance of the spectrum of a matrix B from the spectrum of A in terms of a norm of B - A, the eigenvalues of A and a condition number of N.

Solutions to problems (ii), (iii) and (iv) are classical if A is normal, i.e., $AA^* = A^*A$. Solutions have been constructed for non-normal A, but with less satisfactory results. Some of the bounds given depend on a knowledge of a matrix S in the representation $A = SJS^{-1}$, where J is the Jordan canonical form. Other bounds do not approach the classical bounds if A approaches a normal matrix. The bounds given in the present paper, while depending on the eigenvalues of A and their multiplicities, do not require a knowledge of the Jordan canonical form. Furthermore, our estimates approach the classical estimates for A normal. Our insistence on not using the Jordan form is motivated partly by reasons of computational convenience, and partly by the fact that the Jordan form is a discontinuous function on the space of matrices and is therefore ill suited for purposes of computation (see [8] for related remarks).

We shall develop a canonical form for arbitrary (non-normal) matrices [problem (i) above] which is continuous in nature and for which we can explicitly demonstrate a transforming matrix N and its condition number. Such a canonical form is developed in 2.1 and 3.1. Representations for N are given in 2.21 and 3.2 and estimates for a condition number are given in sections 2.22 and 3.2.

These results are then applied to problems (ii), (iii) and (iv) in Chapters 4 and 5.

Certain estimates based upon a measure of non-normality of a matrix have been derived by Wielandt in [29] and Henrici in [14]. Wielandt's measure is applicable only to matrices which are similar to a diagonal matrix. Henrici removes this restriction but gets estimates in terms of $\lambda_A$, consequently making no use of eigenvalues of smaller modulus.


CHAPTER  1 


Preliminaries  on  Norms 


It  will  be  necessary  to  consider  norms  defined  for  rectangular 
matrices.  That  this  can  be  done  is  shown  by  the  following  lemma. 

LEMMA 1. Given a family F of rectangular matrices of bounded row and column dimensions, say k, and an arbitrary multiplicative norm $\nu$ defined for square matrices, there exists a family of norms $\nu_q$, $q \ge k$, which are multiplicative on F.


Proof. Let A, B, C be members of F, where A and B are of order $r_1 \times s_1$ and C is of order $r_2 \times s_2$. Let q be any integer such that $q \ge k$. In particular then $q \ge r_1, s_1, r_2, s_2$. Define $A_q$ to be the $q \times q$ matrix formed from A by the addition of $q - r_1$ rows of zeros and $q - s_1$ columns of zeros:

$$A_q = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}.$$

Define $\nu_q(A) = \nu(A_q)$.

With this definition $\nu_q$ has the properties of a multiplicative norm, for

(i) $\nu_q(A) = 0$ implies $\nu(A_q) = 0$, which in turn means that $A_q$ and consequently A are null matrices; $\nu_q(A) \ge 0$ since $\nu_q(A) = \nu(A_q) \ge 0$.

(ii) $\nu_q(cA) = \nu[(cA)_q] = \nu(cA_q) = |c|\,\nu(A_q) = |c|\,\nu_q(A)$.

(iii) $\nu_q(A + B) = \nu[(A + B)_q] = \nu[A_q + B_q] \le \nu(A_q) + \nu(B_q) = \nu_q(A) + \nu_q(B)$.

(iv) When AC is defined, i.e., when $s_1 = r_2$, we have

$$(AC)_q = \begin{pmatrix} AC & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} C & 0 \\ 0 & 0 \end{pmatrix} = A_q C_q,$$

and

$$\nu_q(AC) = \nu[(AC)_q] = \nu(A_q C_q) \le \nu(A_q)\,\nu(C_q) = \nu_q(A)\,\nu_q(C).$$

We shall have occasion to deal with functions defined on scalar matrices whose elements are themselves norms of elements of some fixed partitioned matrix. By restricting the class of norms employed we can guarantee that these functions will be norms of the original matrix. The principal result is given by Lemma 3.

If A and B are matrices of the same row and column dimensions, $A \le B$ shall mean $a_{ij} \le b_{ij}$ (i = 1,2,...,m; j = 1,2,...,n). Given any matrix $A = (a_{ij})$, $|A|$ is the matrix whose general element is $|a_{ij}|$. A norm $\nu$ is called monotone if for A and B of the same dimensions, $|A| \le B$ implies $\nu(A) \le \nu(B)$.

A sufficient condition for a norm to be monotone is given by the following lemma.

LEMMA 2. Let $\nu$ be any norm such that $\nu(|A|) = \nu(A)$. Then $\nu$ is monotone.

Proof. Let A and B be given matrices of order $m \times n$ and let $|A| \le B$. Thus

(1.1) $|a_{ij}| \le b_{ij}$ (i = 1,2,...,m; j = 1,2,...,n).

Since, by hypothesis, $\nu$ depends only upon the magnitude of each element of A we can assume that all $a_{ij} \ge 0$. To show that $\nu(A) \le \nu(B)$ it is sufficient to consider the case where only one equality in (1.1) fails to hold, say $a_{rs} < b_{rs}$. By postulate (b) of the definition of a norm we can assume that $b_{rs} = 1$. To simplify the writing we can assume further that we have r = s = 1 and hence $0 \le a_{11} < b_{11} = 1$ and have to show that $\nu(a_{11}, b_{12}, \ldots, b_{mn}) \le \nu(1, b_{12}, \ldots, b_{mn})$. But this follows immediately if we use the decomposition

$$(a_{11}, b_{12}, \ldots, b_{mn}) = \frac{1 + a_{11}}{2}\,(1, b_{12}, \ldots, b_{mn}) + \frac{1 - a_{11}}{2}\,(-1, b_{12}, \ldots, b_{mn})$$

and apply the triangle inequality and postulate (b) of a norm, since we then have

$$\nu(a_{11}, b_{12}, \ldots, b_{mn}) \le \frac{1 + a_{11}}{2}\,\nu(1, b_{12}, \ldots, b_{mn}) + \frac{1 - a_{11}}{2}\,\nu(-1, b_{12}, \ldots, b_{mn})$$

$$= \frac{1 + a_{11}}{2}\,\nu(1, b_{12}, \ldots, b_{mn}) + \frac{1 - a_{11}}{2}\,\nu(1, b_{12}, \ldots, b_{mn}) = \nu(1, b_{12}, \ldots, b_{mn}).$$


This generalizes Ostrowski's [20] concept of coordinatewise symmetric gauge functions.

The  use  we  shall  make  of  the  concept  of  monotone  norms  is 
given  in  the  following  lemma. 

LEMMA 3. Let $A = (A_{ij})$ be a partitioned matrix. Let $\rho$ be an arbitrary multiplicative norm and $\bar A$ the scalar matrix whose general element is $\rho(A_{ij})$.

Then if $\nu$ is a monotone multiplicative norm, the function N, defined by $N(A) = \nu(\bar A)$, is a multiplicative norm.

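Proof. We must verify the four postulates for a multiplicative norm. Namely,

(a) $N(A) = 0$ implies $\nu(\bar A) = 0$ and hence $\bar A = 0$. Then $\rho(A_{ij}) = 0$, $A_{ij} = 0$ and finally A = 0. $N(A) \ge 0$ since $\nu$ is a norm.

(b) $N(cA) = \nu(\overline{cA}) = \nu[\rho(cA_{ij})] = \nu[|c|\,\rho(A_{ij})] = \nu(|c|\,\bar A) = |c|\,\nu(\bar A) = |c|\,N(A)$.

(c) $N(A + B) = \nu(\overline{A + B})$

(1.2) $= \nu[\rho(A_{ij} + B_{ij})] \le \nu[\rho(A_{ij}) + \rho(B_{ij})]$

$= \nu(\bar A + \bar B) \le \nu(\bar A) + \nu(\bar B) = N(A) + N(B)$.

(d) $N(AB) = \nu(\overline{AB})$

(1.3) $= \nu\bigl[\rho\bigl(\sum_k A_{ik} B_{kj}\bigr)\bigr] \le \nu\bigl[\sum_k \rho(A_{ik})\,\rho(B_{kj})\bigr]$

$= \nu(\bar A \bar B) \le \nu(\bar A)\,\nu(\bar B) = N(A)\,N(B)$.

(1.2) and (1.3) hold since $\nu$ is monotone.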
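The construction of Lemma 3 can be tried on a small example. In the sketch below both the inner norm $\rho$ and the outer norm $\nu$ are assumed to be the maximum absolute row sum (which is monotone), and the helper names are ours. The block norm N(A) is formed from the scalar matrix of block norms, and postulates (c) and (d) are checked for one pair of matrices.

```python
def row_sum_norm(a):
    """Maximum absolute row sum; used here for both rho and nu."""
    return max(sum(abs(x) for x in row) for row in a)

def blocks(a, sizes):
    """Split a square matrix into blocks according to a symmetric partition."""
    offsets = [sum(sizes[:i]) for i in range(len(sizes) + 1)]
    return [[[row[offsets[j]:offsets[j + 1]] for row in a[offsets[i]:offsets[i + 1]]]
             for j in range(len(sizes))] for i in range(len(sizes))]

def block_norm(a, sizes, inner=row_sum_norm, outer=row_sum_norm):
    """N(A) = outer(A_bar), where A_bar has general element inner(A_ij)."""
    a_bar = [[inner(blk) for blk in block_row] for block_row in blocks(a, sizes)]
    return outer(a_bar)

def matmul(a, b):
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[1, 2, -1], [0, 3, 4], [-2, 1, 5]]
B = [[2, 0, 1], [1, -1, 0], [3, 2, -2]]
sizes = [1, 2]   # diagonal blocks of order 1 and 2

triangle_ok = block_norm(matadd(A, B), sizes) <= block_norm(A, sizes) + block_norm(B, sizes)
product_ok = block_norm(matmul(A, B), sizes) <= block_norm(A, sizes) * block_norm(B, sizes)
print(triangle_ok, product_ok)
```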

The following are some of the most common norms of matrices $A = (a_{ij})$. (See [16], [21].)


O(A)  -  £  1 1 


ij 


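$$\sigma(A) = \max_{x \ne 0} \left[ \frac{x^* A^* A x}{x^* x} \right]^{1/2} \qquad \text{(Spectral norm)}$$

$$\rho(A) = \max_i \sum_j |a_{ij}|$$

$$\gamma(A) = \max_j \sum_i |a_{ij}|$$

$$\varepsilon(A) = \Bigl[ \sum_{i,j} |a_{ij}|^2 \Bigr]^{1/2} \qquad \text{(Euclidean norm)}$$

The last three are obvious examples of monotone norms.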
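Monotonicity can be observed directly for the row-sum, column-sum and Euclidean norms. The following is a minimal sketch with descriptive function names of our own choosing; it checks one pair A, B with $|A| \le B$ and does not replace the argument of Lemma 2.

```python
import math

def max_row_sum(a):
    """Maximum over rows of the sum of absolute values."""
    return max(sum(abs(x) for x in row) for row in a)

def max_col_sum(a):
    """Maximum over columns of the sum of absolute values."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def euclidean(a):
    """Square root of the sum of squared moduli of all elements."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def dominated(a, b):
    """True when |A| <= B elementwise."""
    return all(abs(x) <= y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[1, -2], [3, 0]]
B = [[1, 2], [4, 1]]

monotone_on_example = dominated(A, B) and all(
    norm(A) <= norm(B) for norm in (max_row_sum, max_col_sum, euclidean))
print(monotone_on_example)
```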
If $\varphi$ is a vector norm, then the function defined by

$$\nu(A) = \sup_{x \ne 0} \frac{\varphi(Ax)}{\varphi(x)}$$

always defines a matrix norm. Matrix norms defined in this way are called lub norms in [3]. The norms $\sigma$, $\rho$ and $\gamma$ defined above can be derived in this manner from suitable vector norms [24]. On the other hand, some matrix norms, such as $\varepsilon$, cannot be thus derived.

We  shall  use  the  following  definitions: 

A matrix norm $\nu$ is called compatible with a vector norm $\varphi$ if $\varphi(Ax) \le \nu(A)\,\varphi(x)$ for all matrices A and vectors x. A lub norm is always compatible with the vector norm defining it.

A matrix norm $\nu$ will be called unitarily invariant if $\nu(U^*AU) = \nu(A)$ for all A and all unitary U. The norms $\sigma$ and $\varepsilon$ are unitarily invariant, while $\rho$ and $\gamma$ are not.

A lub norm $\nu$ is called axis-oriented [3] if $\nu(D) = \max_{1 \le i \le n} |d_{ii}| = \lambda_D$ for any diagonal matrix $D = (d_{ii})$. The lub norms $\sigma$, $\rho$, $\gamma$ are axis-oriented.

A norm $\nu$ is said to majorize another norm $\rho$ if $\nu(A) \ge \rho(A)$ for all A. The $\varepsilon$ norm majorizes $\sigma$.

We shall require the following consequences of the defining properties of a norm. (See [21] for proofs.)

LEMMA 4. If $\lambda_A$ denotes the spectral radius of A, then $\nu(A) \ge \lambda_A$ for any matrix norm $\nu$.
LEMMA 5. If $\rho$ and $\nu$ are any two matrix norms, then there exists a constant $P_{\rho\nu}$, depending only on these two norms, such that

$$\rho(A) \le P_{\rho\nu}\,\nu(A)$$

for all matrices A.

Values of $P_{\rho\nu}$ for special norms are given in [27].

We have finally

LEMMA 6. Let D be the quasi-diagonal matrix $D = \mathrm{dg}(D_1, D_2, \ldots, D_k)$. Then $\sigma(D) = \max_i \sigma(D_i)$.

For the proof we note first that $\sigma^2(A) = \lambda_{A^*A}$ for all A and in particular $\sigma^2(D) = \lambda_{D^*D}$. But $D^*D = \mathrm{dg}(D_1^*D_1, D_2^*D_2, \ldots, D_k^*D_k)$ and the eigenvalues of $D^*D$ are the union of those of the $D_i^*D_i$. Thus $\lambda_{D^*D} = \max_i \lambda_{D_i^*D_i}$ and $\sigma^2(D) = \max_i \sigma^2(D_i)$.


CHAPTER  2 


Introduction. This chapter is divided into two sections. The first is concerned with the problem of establishing the existence of a matrix N such that $N^{-1}BN$ is quasi-diagonal for an arbitrary matrix B. The results of this section are used to give an explicit representation for N, above, and to estimate its condition number.

Section 2.1. We may restrict our attention to any one of the triangular forms A of a given matrix B since every matrix is unitarily similar to a triangular matrix, and since we shall ultimately make use only of unitarily invariant norms. It is true further that the ordering of the diagonal elements of A (the eigenvalues of A) may be specified arbitrarily. It is of importance to subsequent estimates we shall make that the specification of the ordering of eigenvalues does not uniquely determine the triangular form.

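We assume then that

$$\lambda_1 \preceq \lambda_2 \preceq \cdots \preceq \lambda_n$$

where $\lambda_i = a_{ii}$, i = 1,2,...,n are the eigenvalues of A, and where $\lambda_i \prec \lambda_j$ means that either

(a) $\mathrm{Re}\,\lambda_i < \mathrm{Re}\,\lambda_j$ or

(b) $\mathrm{Re}\,\lambda_i = \mathrm{Re}\,\lambda_j$ and $\mathrm{Im}\,\lambda_i < \mathrm{Im}\,\lambda_j$.

This is the so-called lexicographic, or dictionary, ordering of the complex plane.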
If A has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ we define the (disjoint) sets $S_1, S_2, \ldots, S_k$ of order $n_1, n_2, \ldots, n_k$ respectively, such that $\sum_{i=1}^{k} n_i = n$, and

(a) $\lambda_i \in S_p$ and $\lambda_j \in S_p$ imply $\lambda_i = \lambda_j$,

(b) $\lambda_i \in S_p$, $\lambda_j \in S_q$ and p < q imply $\lambda_i \prec \lambda_j$.

These sets $S_p$ uniquely determine a symmetric partition of A if we specify that the diagonal terms of each diagonal submatrix resulting from the partitioning belong to one and only one of the $S_p$.

Any matrix which satisfies the above triangularity, ordering and partition conditions shall be said to be in an ordered Schur form or to be an ordered Schur matrix.
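The ordering and partition conditions can be sketched as a small routine that sorts a list of eigenvalues lexicographically (real part first, then imaginary part) and collects equal eigenvalues into one block. This is only an illustration with names of our own; it compares eigenvalues exactly, so computed floating-point eigenvalues would need a tolerance that is not modeled here.

```python
def lex_key(z):
    """Sort key for the lexicographic ordering of the complex plane."""
    z = complex(z)
    return (z.real, z.imag)

def ordered_blocks(eigenvalues):
    """Sort lexicographically and group equal eigenvalues into one block."""
    groups = []
    for z in sorted(eigenvalues, key=lex_key):
        if groups and groups[-1][0] == z:
            groups[-1].append(z)      # same eigenvalue: same diagonal block
        else:
            groups.append([z])        # new eigenvalue: start a new block
    return groups

def block_orders(eigenvalues):
    """The orders n_1, ..., n_k of the diagonal blocks."""
    return [len(g) for g in ordered_blocks(eigenvalues)]

spectrum = [2, 1 + 1j, 1 - 1j, 2, 0]
print(ordered_blocks(spectrum), block_orders(spectrum))
```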

Let then $A = (A_{ij})$, (i, j = 1,2,...,k) be any ordered Schur matrix, which we assume is of order k, and where the $A_{ij}$ are matrices of order $n_i \times n_j$ with $\sum_{i=1}^{k} n_i = n$.

1  J  i-1  1 

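DEFINITION. $E_i$ denotes the row vector

$$E_i = (E_{i1}, E_{i2}, \ldots, E_{ik})$$

where $E_{is}$ is of order $n_i \times n_s$ and

$$E_{is} = \begin{cases} 0 & \text{if } s \ne i \\ I_{n_i} & \text{if } s = i. \end{cases}$$

Here $I_{n_i}$ denotes the $n_i \times n_i$ identity matrix and I (with no subscripts) is the $n \times n$ identity matrix partitioned as A above.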
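DEFINITION. $E_i^T$ is the column vector whose entries are $E_{i1}^T, E_{i2}^T, \ldots, E_{ik}^T$, where the $E_{is}^T$ are the transposes of the $E_{is}$. Then

$$\text{(2.1-1)} \qquad E_i E_j^T = E_j E_i^T = 0 \quad (j \ne i), \qquad E_i E_i^T = I_{n_i}.$$

It then follows that

$$\text{(2.1-2)} \qquad A_{ij} = E_i A E_j^T.$$

The matrix $M = E_i^T N_{ij} E_j$, where $N_{ij}$ is an arbitrary $n_i \times n_j$ matrix, is such that

$$\text{(2.1-3)} \qquad M_{rs} = \begin{cases} N_{ij} & \text{for } r = i,\ s = j \\ 0 & \text{otherwise.} \end{cases}$$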


DEFINITION. The partitioned matrices $E_{ij}$ defined by

$$E_{ij} = E_{ij}(E_i, E_j; N_{ij}) = I + E_i^T N_{ij} E_j,$$

where the $N_{ij}$ are arbitrary $n_i \times n_j$ matrices for i > j and null matrices otherwise, are called elementary block matrices. That is

$$\text{(2.1-4)} \qquad (E_{ij})_{rs} = \begin{cases} I_{n_r} & r = s \\ N_{ij} & r = i,\ s = j \\ 0 & \text{otherwise.} \end{cases}$$

Using (2.1-1), (2.1-2) and (2.1-4), the following properties of elementary block matrices may be verified:

$$E_{ij}^{-1}(E_i, E_j; N_{ij}) = E_{ij}[E_i, E_j; -N_{ij}],$$

$$E_{ij} E_{kj} = E_{ij} + E_{kj} - I.$$


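Letting

$$\text{(2.1-5)} \qquad N_j = \prod_{i=j+1}^{k} E_{ij} = \prod_{i>j} E_{ij} \qquad (j = 1, 2, \ldots, k-1)$$

we have

$$N_j = I + \sum_{i>j} E_i^T N_{ij} E_j.$$

That is

$$(N_j)_{rs} = \begin{cases} I_{n_r} & r = s \\ N_{rj} & r > s,\ s = j \\ 0 & \text{otherwise} \end{cases} \qquad (r, s = 1, 2, \ldots, k).$$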


R7l * 1 "  (& E‘  VV 


Furthermore, 


Letting

$$\text{(2.1-6)} \qquad N = \prod_{j=1}^{k-1} N_j$$

we have

$$(N)_{rs} = \begin{cases} I_{n_r} & r = s \\ N_{rs} & r > s \\ 0 & r < s. \end{cases}$$

(i  - 


k— 1 

n_1  -  TT  . 

J-0  J 


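Letting $\tilde N = (\tilde N_{ij})$, where

$$\tilde N_{ij} = \begin{cases} N_{ij} & i > j \\ 0 & i \le j \end{cases}$$

we have, since $\tilde N^r = 0$ for $r \ge k$,

$$N^{-1} = (I + \tilde N)^{-1} = I - \tilde N + \tilde N^2 - \cdots + (-1)^{k-1} \tilde N^{k-1}.$$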


DEFINITION. The function $F_{ij}$ defined by

$$F_{ij}(A) = E_{ij}^{-1} A E_{ij}, \qquad i, j = 1, 2, \ldots, k$$

is called an elementary block similarity transformation. We shall study the effect of $F_{ij}$ on the ordered Schur form A. For i > j we have, using (2.1-3),


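$$\text{(2.1-7)} \qquad (A(E_i^T N_{ij} E_j))_{rs} = \sum_p A_{rp}\,(E_i^T N_{ij} E_j)_{ps} = \begin{cases} A_{ri} N_{ij}, & s = j \\ 0, & s \ne j \end{cases}$$

$$\text{(2.1-8)} \qquad ((E_i^T N_{ij} E_j)A)_{rs} = \sum_p (E_i^T N_{ij} E_j)_{rp}\,A_{ps} = \begin{cases} 0, & r \ne i \\ N_{ij} A_{js}, & r = i. \end{cases}$$

Triangularity of A yields

$$\text{(2.1-9)} \qquad (E_i^T N_{ij} E_j)\,A\,(E_i^T N_{ij} E_j) = E_i^T N_{ij} A_{ji} N_{ij} E_j = 0.$$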


Using (2.1-9) with i > j we have by a straightforward computation

$$[F_{ij}(A)]_{rs} = (E_{ij}^{-1} A E_{ij})_{rs} = A_{rs} - ((E_i^T N_{ij} E_j)A)_{rs} + (A(E_i^T N_{ij} E_j))_{rs}.$$

Coupled with (2.1-7) and (2.1-8) this yields for i > j:

$$[F_{ij}(A)]_{rs} = \left\{ \begin{array}{lll} A_{is} - N_{ij} A_{js}, & r = i,\ s \ne j & \text{(2.1-10)} \\ A_{rj} + A_{ri} N_{ij}, & r \ne i,\ s = j & \text{(2.1-11)} \\ A_{ij} + A_{ii} N_{ij} - N_{ij} A_{jj}, & r = i,\ s = j & \text{(2.1-12)} \\ A_{rs}, & \text{otherwise} & \text{(2.1-13)} \end{array} \right.$$

Recalling that A is triangular we note that

$$[F_{ij}(A)]_{rs} = \begin{cases} 0 & r < s \\ A_{rr} & r = s. \end{cases}$$

We note further that only elements in the ith row and those in the jth column of A are altered by $F_{ij}$.

For i = j we have $E_{ij} = E_{ii} = I$ and consequently $[F_{ii}(A)]_{rs} = A_{rs}$.

By (2.1-12) we see that $[F_{ij}(A)]_{ij} = 0$ if and only if there exists an $n_i \times n_j$ matrix $N_{ij}$ such that

$$\text{(2.1-14)} \qquad -A_{ii} N_{ij} + N_{ij} A_{jj} = A_{ij}.$$


That  this  equation  is  solvable  may  be  seen  from  the  following 
theorem  which  is  proved  in  the  Appendix. 

THEOREM A-1. A necessary and sufficient condition that the matrix equation -AX + XB = C have a solution for all C is that the eigenvalues of A be distinct from those of B.
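For lower triangular A and B, which is the situation of the diagonal blocks here, the equation -AX + XB = C can be solved by substitution. The sketch below is our own illustration, not the construction of the Appendix. It uses exact rational arithmetic, and the function names are hypothetical. Entry (i, j) of the equation reads $x_{ij}(b_{jj} - a_{ii}) = c_{ij} + \sum_{p<i} a_{ip} x_{pj} - \sum_{p>j} x_{ip} b_{pj}$, so a division by zero occurs exactly when A and B share an eigenvalue.

```python
from fractions import Fraction

def solve_sylvester_lower(a, b, c):
    """Solve -A X + X B = C for lower triangular A (m x m) and B (n x n)."""
    m, n = len(a), len(b)
    x = [[Fraction(0)] * n for _ in range(m)]
    for i in range(m):                      # rows from the top
        for j in reversed(range(n)):        # each row from right to left
            denom = Fraction(b[j][j]) - Fraction(a[i][i])
            if denom == 0:
                raise ValueError("A and B share an eigenvalue")
            rhs = Fraction(c[i][j])
            rhs += sum(a[i][p] * x[p][j] for p in range(i))
            rhs -= sum(x[i][p] * b[p][j] for p in range(j + 1, n))
            x[i][j] = rhs / denom
    return x

def residual(a, b, c, x):
    """Entries of -A X + X B - C; all zero for an exact solution."""
    m, n = len(a), len(b)
    return [[-sum(a[i][p] * x[p][j] for p in range(m))
             + sum(x[i][p] * b[p][j] for p in range(n)) - c[i][j]
             for j in range(n)] for i in range(m)]

A = [[1, 0], [2, 3]]     # eigenvalues 1, 3
B = [[5, 0], [1, 7]]     # eigenvalues 5, 7
C = [[1, 2], [3, 4]]
X = solve_sylvester_lower(A, B, C)
print(X, residual(A, B, C, X))
```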


Indeed, since A was assumed to be an ordered Schur form and $i \ne j$, the eigenvalues of $A_{ii}$ are distinct from those of $A_{jj}$.

We shall now consider the effect of successive applications of elementary similarity transformations, $F_{ij}$, on an ordered Schur form A where at each stage $N_{ij}$ is chosen as the solution to (2.1-14). For this we introduce the following notation.

Let

$$A^{(1,1)} = F_{11}(A) = A$$

$$A^{(2,1)} = F_{21}[A^{(1,1)}]$$

$$A^{(3,1)} = F_{31}[A^{(2,1)}]$$

$$\cdots$$

$$A^{(i,j)} = F_{ij}[A^{(i-1,j)}], \qquad i > j$$

$$A^{(i,i)} = F_{ii}[A^{(k,i-1)}] = A^{(k,i-1)}, \qquad i = j = 2, 3, \ldots, k$$

where $F_{ij}$ is determined by the condition that $A_{ij}^{(i,j)} = 0$, i.e., that (2.1-14) is satisfied.

If we define $A^{(i-1,i)} = A^{(k,i-1)}$ for i = 2, 3, ..., k we may write

$$A^{(i,j)} = F_{ij}[A^{(i-1,j)}], \qquad i \ge j,\ i \ne 1$$

$$A^{(1,1)} = A.$$


Rewriting (2.1-10) through (2.1-13) in our new notation we have

$$A_{rs}^{(i,j)} = \left\{ \begin{array}{lll} A_{is}^{(i-1,j)} - N_{ij} A_{js}^{(i-1,j)}, & r = i,\ s \ne j & \text{(2.1-15)} \\ A_{rj}^{(i-1,j)} + A_{ri}^{(i-1,j)} N_{ij}, & r \ne i,\ s = j & \text{(2.1-16)} \\ A_{ij}^{(i-1,j)} + A_{ii}^{(i-1,j)} N_{ij} - N_{ij} A_{jj}^{(i-1,j)}, & r = i,\ s = j & \text{(2.1-17)} \\ A_{rs}^{(i-1,j)}, & \text{otherwise} & \text{(2.1-18)} \end{array} \right.$$

Since $A_{rr}^{(i,j)} = A_{rr}$ (r = 1,2,...,k) for all (i,j) we have

$$\text{(2.1-19)} \qquad A_{ij}^{(i,j)} = A_{ij}^{(i-1,j)} + A_{ii} N_{ij} - N_{ij} A_{jj}.$$


We claim that the result of application of the $F_{ij}$ in some order is to reduce A to a quasi-diagonal form. That this is true can be seen from the following lemma.

LEMMA 7. If $A_{rs}^{(i-1,j)} = 0$ for s > r; for s < j where $s < r \le k$; and for s = j where $j < r \le i - 1$, and if $N_{ij}$ is chosen such that $A_{ij}^{(i,j)} = 0$ in (2.1-19), then $A_{rs}^{(i,j)} = 0$ for s > r; for s < j where $s < r \le k$; and for s = j where $j < r \le i$.

Proof.

(i) $r \ne i$, $s \ne j$.

$A_{rs}^{(i,j)} = A_{rs}^{(i-1,j)}$ by (2.1-18).

(ii) r = i, s < j.

By (2.1-15)

$$A_{is}^{(i,j)} = A_{is}^{(i-1,j)} - N_{ij} A_{js}^{(i-1,j)} = 0$$

since $A_{is}^{(i-1,j)}$ and $A_{js}^{(i-1,j)}$ are, by hypothesis, null matrices.

(iii) r = i, s > j.

By (2.1-15)

$$A_{is}^{(i,j)} = A_{is}^{(i-1,j)} - N_{ij} A_{js}^{(i-1,j)} = A_{is}^{(i-1,j)}$$

since $A^{(i-1,j)}$ is lower triangular.

(iv) r < i, s = j.

From (2.1-16) and the triangularity of $A^{(i-1,j)}$

$$A_{rj}^{(i,j)} = A_{rj}^{(i-1,j)} + A_{ri}^{(i-1,j)} N_{ij} = A_{rj}^{(i-1,j)}.$$

(v) r > i, s = j.

From (2.1-16)

$$A_{rj}^{(i,j)} = A_{rj}^{(i-1,j)} + A_{ri}^{(i-1,j)} N_{ij}.$$

(vi) r = i, s = j.

$A_{ij}^{(i,j)} = 0$ by definition of $F_{ij}$.


Comments. We have proved more than the statement of the lemma. Indeed we have shown that the only elements of $A^{(i-1,j)}$ which are altered by $F_{ij}$ are the $A_{rj}^{(i-1,j)}$, $r \ge i$.

Thus the sequence of elementary transformations $F_{21}, F_{31}, \ldots, F_{k1}, F_{32}, F_{42}, \ldots, F_{k2}, \ldots, F_{k,k-1}$ reduces A to a quasi-diagonal form whose diagonal blocks are precisely those of A.

Let X denote the matrix formed by the multiplication of the $E_{ij}$ taken in the same order as the $F_{ij}$. Namely,

$$X = \prod_{j=1}^{k-1} \prod_{i>j} E_{ij}.$$

Then $X^{-1}AX = Q$, where Q is quasi-diagonal with $Q_{ii} = A_{ii}$. But from (2.1-5) and (2.1-6)

$$\prod_{j=1}^{k-1} \prod_{i>j} E_{ij} = N.$$

Thus N = X and we have finally,


THEOREM 1. For a given ordered Schur matrix $A = (A_{ij})$ of order k, let $N = (N_{ij})$ be the sp-triangular matrix such that for i > j $N_{ij}$ satisfies

$$-A_{ii} N_{ij} + N_{ij} A_{jj} = A_{ij}^{(i-1,j)}$$

and such that

$$N_{ii} = I_{n_i} \qquad (i = 1, 2, \ldots, k).$$

Then

$$N^{-1}AN = Q$$

where Q is quasi-diagonal with $Q_{ii} = A_{ii}$ (i = 1,2,...,k).
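A minimal sketch of this construction follows, assuming every diagonal block has order 1, so that A is an ordinary lower triangular matrix with distinct diagonal entries and Q is its diagonal. The right-hand side is taken as $\sum_{l=j}^{i-1} a_{il} n_{lj}$, which is what the requirement AN = NQ gives entry by entry. The function names are ours and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def quasi_diagonalizer(a):
    """Unit lower triangular N with A N = N diag(A), for lower triangular A
    whose diagonal entries are pairwise distinct (all blocks of order 1)."""
    k = len(a)
    n = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    for j in range(k):                       # column by column
        for i in range(j + 1, k):            # rows below the diagonal, top down
            denom = Fraction(a[j][j]) - Fraction(a[i][i])
            if denom == 0:
                raise ValueError("equal eigenvalues must share a block")
            # -a_ii n_ij + n_ij a_jj = sum_{l=j}^{i-1} a_il n_lj
            rhs = sum(a[i][l] * n[l][j] for l in range(j, i))
            n[i][j] = rhs / denom
    return n

def matmul(a, b):
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
N = quasi_diagonalizer(A)
Q = [[A[i][i] if i == j else 0 for j in range(3)] for i in range(3)]
print(matmul(A, N) == matmul(N, Q))   # A N = N Q, i.e. N^{-1} A N = Q
```

The test A N = N Q avoids forming $N^{-1}$ explicitly. With larger diagonal blocks each scalar division would become the solution of a matrix equation of the form (2.1-14).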


Section 2.2

Introduction. We shall begin this section by determining an expression for the elements $A_{ij}^{(i-1,j)}$ in terms of elements of A and N. Using the integral representation formula for $N_{ij}$ given in the Appendix, we are able to express N as a sum of certain matrices each of whose elements are integrals of certain functions of the elements of A.

Using the above representation of N we are able to give a bound for $C_\nu(N)$.

By further restricting the form of A we are able to give a bound for a condition number which does not require the explicit computation of N.


2.21. A representation for N.

We begin with the following lemmas.

LEMMA 8. For j + p < r

$$\text{(2.2-1)} \qquad A_{rj}^{(j+p,j)} = \sum_{\ell=0}^{p} A_{r,j+\ell}\,N_{j+\ell,j} \qquad (j = 1, 2, \ldots, k).$$

LEMMA 9. If r > j,

$$\text{(2.2-2)} \qquad A_{rj}^{(r-1,j)} = \sum_{\ell=j}^{r-1} A_{r\ell}\,N_{\ell j}.$$

Lemma 9 is a particular case of Lemma 8 where p = (r - 1) - j.
Proof of Lemma 8. For p = 1, we have by (2.1-16)

$$A_{rj}^{(j+1,j)} = A_{rj}^{(j,j)} + A_{r,j+1}^{(j,j)} N_{j+1,j}$$

$$= A_{rj}^{(k,j-1)} + A_{r,j+1}^{(k,j-1)} N_{j+1,j}$$

$$= A_{rj} + A_{r,j+1} N_{j+1,j}$$

$$= \sum_{\ell=0}^{1} A_{r,j+\ell}\,N_{j+\ell,j}.$$

The validity of the next to last statement follows from Lemma 7. Assuming now that (2.2-1) holds for p - 1 we have by (2.1-16) and Lemma 7:

$$A_{rj}^{(j+p,j)} = A_{rj}^{(j+p-1,j)} + A_{r,j+p}^{(j+p-1,j)} N_{j+p,j}$$

$$= \sum_{\ell=0}^{p-1} A_{r,j+\ell}\,N_{j+\ell,j} + A_{r,j+p} N_{j+p,j}$$

$$= \sum_{\ell=0}^{p} A_{r,j+\ell}\,N_{j+\ell,j}$$

which is the statement of the lemma.

We now reproduce an integral representation for $N_{ij}$. The proof of the validity of this representation and related results are to be found in the Appendix.

THEOREM.

$$N_{ij} = -\int_0^\infty e^{-A_{ii}t}\,A_{ij}^{(i-1,j)}\,e^{A_{jj}t}\,dt.$$


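Let now $\tilde A = A - \mathrm{dg}(A)$; $\tilde A$ is then a strictly lower triangular matrix. Define now

$$\tilde A^{(0)} = I,$$

$$\tilde A^{(1)} = (\tilde A^{(1)}_{rs}), \quad \text{where} \quad \tilde A^{(1)}_{rs} = -\int_0^\infty e^{-A_{rr}t}\,(\tilde A \tilde A^{(0)})_{rs}\,e^{A_{ss}t}\,dt,$$

$$\tilde A^{(2)} = (\tilde A^{(2)}_{rs}), \quad \text{where} \quad \tilde A^{(2)}_{rs} = -\int_0^\infty e^{-A_{rr}t}\,(\tilde A \tilde A^{(1)})_{rs}\,e^{A_{ss}t}\,dt,$$

or in general,

$$\tilde A^{(p+1)} = (\tilde A^{(p+1)}_{rs}); \qquad \tilde A^{(p+1)}_{rs} = -\int_0^\infty e^{-A_{rr}t}\,(\tilde A \tilde A^{(p)})_{rs}\,e^{A_{ss}t}\,dt.$$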

The above matrices will not be defined in all cases when the eigenvalues of A are complex. In this case we alter the definitions to read

$$\tilde A^{(0)} = I$$

$$\tilde A^{(p+1)} = (\tilde A^{(p+1)}_{rs})$$

$$\tilde A^{(p+1)}_{rs} = -e^{i\theta} \int_0^\infty \exp\bigl[-e^{i\theta} A_{rr} t\bigr]\,(\tilde A \tilde A^{(p)})_{rs}\,\exp\bigl[e^{i\theta} A_{ss} t\bigr]\,dt$$

where $\theta$ is determined by the eigenvalues of A and is given by Lemma E of the Appendix. Noting that for any lower triangular matrix B, $e^{Bt}$ is lower triangular, we see that the diagonal entries of $\tilde A$ and hence of $\tilde A^{(1)}$ are all zero, the first two diagonals of $\tilde A^{(2)}$ vanish; indeed $\tilde A^{(r)} = 0$, $r \ge k$.

We are now able to represent N explicitly as indicated in the introduction. Namely, we have the following theorem.

THEOREM 2. If

$$\text{(2.2-3)} \qquad M = I + \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)}$$

then N = M.

Proof. $N_{jj} = I_{n_j}$ (j = 1,2,...,k) which agrees with the above. $M_{\ell j} = 0$ for $\ell < j$ according to (2.2-3). But $N_{\ell j} = 0$ for $\ell < j$ also, since N is lower triangular by construction. Assume now that

$$N_{\ell j} = (I + \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)})_{\ell j}, \qquad j \le \ell \le i.$$

Then, since

$$N_{i+1,j} = -\int_0^\infty e^{-A_{i+1,i+1}t}\,A_{i+1,j}^{(i,j)}\,e^{A_{jj}t}\,dt$$

and

$$A_{i+1,j}^{(i,j)} = \sum_{\ell=j}^{i} A_{i+1,\ell}\,N_{\ell j},$$

we have from Lemma 9

$$N_{i+1,j} = -\int_0^\infty e^{-A_{i+1,i+1}t} \Bigl\{ \sum_{\ell=j}^{i} A_{i+1,\ell}\,(I + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{\ell j} \Bigr\}\,e^{A_{jj}t}\,dt$$

$$= -\int_0^\infty e^{-A_{i+1,i+1}t} \Bigl\{ (\tilde A I)_{i+1,j} + (\tilde A \tilde A^{(1)})_{i+1,j} + (\tilde A \tilde A^{(2)})_{i+1,j} + \cdots + (\tilde A \tilde A^{(k-1)})_{i+1,j} \Bigr\}\,e^{A_{jj}t}\,dt$$

$$= \tilde A^{(1)}_{i+1,j} + \tilde A^{(2)}_{i+1,j} + \cdots + \tilde A^{(k)}_{i+1,j}$$

$$= (I + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{i+1,j}$$

since $\tilde A_{rs} = A_{rs}$ for r > s; $\tilde A^{(k)} = 0$ and $I_{i+1,j} = 0$. The above induction step completes the proof.

It is appropriate at this point to consider the behavior of N as B approaches normality. Let us consider the given matrix B and $A = U^*BU$ where A is an ordered Schur form and U is unitary. We put

$$A = D + M,$$

where D denotes the diagonal matrix whose main diagonal coincides with that of A. Since $\varepsilon$ is unitarily invariant, $[\varepsilon(B)]^2 = [\varepsilon(A)]^2 = [\varepsilon(D)]^2 + [\varepsilon(M)]^2$. It follows that

$$\varepsilon(M) = \bigl\{ [\varepsilon(B)]^2 - [\varepsilon(D)]^2 \bigr\}^{1/2}$$

is independent of the special choice of ordered Schur form. Noting that B is normal iff $\varepsilon(M) = 0$ (see [19], Theorem 10.3.8), we see that B and, from the continuity of the Schur form, A approaches normality as $\varepsilon(M) \to 0$ or, what is the same thing, as the off-diagonal elements of A approach zero. The elements of the matrices $\tilde A^{(i)}$, i = 1,2,...,k - 1 depend continuously on the off-diagonal elements of A and thus approach zero. Finally then, $N \to I$ continuously as B approaches normality.

2.22. A bound for the condition of N. It is now possible to give an upper bound for a condition number of N.

Let

\[ L = \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)}. \]

Then L is strictly lower triangular and $L^r = 0$, $r \ge k$. Since $N = I + L$,

\[ N^{-1} = (I + L)^{-1} = I - L + L^2 - \cdots + (-1)^{k-1} L^{k-1}. \]

We have then the following theorem.

THEOREM 3. For any multiplicative norm $\nu$,

\[ C_\nu(N) \le \nu(N) \big[ \nu(I) + \nu(L) + \cdots + [\nu(L)]^{k-1} \big] \]

where $L = \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)}$.
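
The finite Neumann series for $(I + L)^{-1}$ and the resulting condition bound can be checked numerically. The following is a minimal sketch, assuming NumPy, the spectral norm as the multiplicative norm $\nu$, and the scalar case in which every diagonal block is 1 x 1 (so the series stops at the order n of the matrix). The function names and the sample matrix are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def inverse_of_unit_lower(L):
    """Return (I + L)^{-1} = I - L + L^2 - ... for strictly lower triangular L.

    The series terminates because L is nilpotent (L^n = 0 for an n x n matrix).
    """
    n = L.shape[0]
    inv = np.eye(n)
    term = np.eye(n)
    for _ in range(1, n):
        term = -term @ L          # next term (-L)^m
        inv = inv + term
    return inv

def condition_bound(L):
    """nu(N) * (nu(I) + nu(L) + ... + nu(L)^(n-1)), nu = spectral norm."""
    n = L.shape[0]
    nu_L = np.linalg.norm(L, 2)
    nu_N = np.linalg.norm(np.eye(n) + L, 2)
    return nu_N * sum(nu_L ** m for m in range(n))

# Hypothetical example data: a fixed strictly lower triangular L.
L = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0, 0.0],
              [0.3, -0.4, 1.5, 0.0]])
N = np.eye(4) + L
N_inv = inverse_of_unit_lower(L)
cond = np.linalg.norm(N, 2) * np.linalg.norm(N_inv, 2)
bound = condition_bound(L)
print(cond, bound)
```

The bound is usually far from sharp; its appeal is that it needs only one norm evaluation of L rather than an explicit inverse.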


2.23. Restricted Schur forms and a new bound for $C_\nu(N)$. Use of the bound for $C_\nu(N)$ given in Theorem 3 requires, of course, the calculation of $\tilde A^{(i)}$, $i = 1, 2, \ldots, k-1$. Due to the prohibitive nature of the calculations required we shall derive a new bound for $C_\nu(N)$. This bound does not require the calculation of the $\tilde A^{(i)}$ but further restricts the Schur form and does not in general yield as sharp a bound as that given by Theorem 3.

We begin by finding a bound for the norm of X, where

\[ X = -\int_0^\infty e^{-At}\, C\, e^{Bt}\, dt \]

and A and B are lower triangular. It will be necessary to make certain assumptions regarding the eigenvalues of A and B. We need, however, to prove the following lemma first.

LEMMA 10. Let B(t) be any Riemann integrable matrix function (i.e., each element of B(t) is integrable), and let $\|\cdot\|$ denote an arbitrary matrix norm. If

\[ A = \int_0^\infty B(t)\, dt \]

then

\[ \|A\| \le \int_0^\infty \|B(t)\|\, dt. \]

Proof. Let

\[ A(x) = \int_0^x B(t)\, dt. \]

Then

\[ A'(x) = B(x) \quad \text{and} \quad A(\infty) = A. \]

By a result of Dahlquist [7] we have for every matrix A(x) and norm $\|\cdot\|$ that

\[ \|A(x)\|' \le \|A'(x)\|. \tag{2.2-4} \]

Integrating (2.2-4),

\[ \|A(x)\| - \|A(0)\| \le \int_0^x \|A'(t)\|\, dt \]

or

\[ \|A(x)\| \le \int_0^x \|B(t)\|\, dt. \]

Hence

\[ \|A\| \le \int_0^\infty \|B(t)\|\, dt. \]


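We now turn to the problem of finding a bound for the norm of

\[ X = -\int_0^\infty e^{-At}\, C\, e^{Bt}\, dt. \]

We shall assume that

(1) A, B are lower triangular matrices of order $n_A$, $n_B$ respectively.

(2) The diagonal entries of A (B) are all equal to some constant we shall denote by $\lambda_A$ ($\lambda_B$), i.e. A (B) has only one eigenvalue $\lambda_A$ ($\lambda_B$) repeated $n_A$ ($n_B$) times.

(3) $\lambda_A > \lambda_B$.

A and B can then be written as

\[ A = \lambda_A I + L_A, \qquad B = \lambda_B I + L_B \]

where $L_A$ and $L_B$ are strictly lower triangular.

Since $\lambda_A I$ and $\lambda_B I$ are scalar matrices they commute respectively with $L_A$ and $L_B$ and we have

\begin{align*}
e^{-At} &= e^{-(\lambda_A I + L_A)t} = e^{-\lambda_A I t}\, e^{-L_A t} \\
e^{Bt} &= e^{(\lambda_B I + L_B)t} = e^{\lambda_B I t}\, e^{L_B t} \\
e^{-At}\, C\, e^{Bt} &= e^{(\lambda_B - \lambda_A)t} \sum_{r=0}^{n_A-1} \sum_{s=0}^{n_B-1} \frac{(-1)^r}{r!\, s!}\, L_A^r\, C\, L_B^s\, t^{r+s}.
\end{align*}

For any axis-oriented multiplicative norm $\|\cdot\|$ we have from Lemma 10 that

\begin{align*}
\|X\| &\le \int_0^\infty e^{(\lambda_B - \lambda_A)t} \sum_{r=0}^{n_A-1} \sum_{s=0}^{n_B-1} \frac{1}{r!\, s!}\, \|L_A\|^r\, \|C\|\, \|L_B\|^s\, t^{r+s}\, dt \\
&= \|C\| \sum_{r=0}^{n_A-1} \sum_{s=0}^{n_B-1} \frac{\|L_A\|^r \|L_B\|^s}{r!\, s!} \int_0^\infty e^{(\lambda_B - \lambda_A)t}\, t^{r+s}\, dt \\
&= \|C\| \sum_{r=0}^{n_A-1} \sum_{s=0}^{n_B-1} \frac{\|L_A\|^r \|L_B\|^s}{r!\, s!}\, \frac{(r+s)!}{(\lambda_A - \lambda_B)^{r+s+1}} \\
&= \frac{\|C\|}{\lambda_A - \lambda_B} \sum_{r=0}^{n_A-1} \sum_{s=0}^{n_B-1} \binom{r+s}{r} \Big( \frac{\|L_A\|}{\lambda_A - \lambda_B} \Big)^r \Big( \frac{\|L_B\|}{\lambda_A - \lambda_B} \Big)^s.
\end{align*}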


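Note that, for $x, y \ge 0$,

\begin{align*}
\sum_{r=0}^{R} \sum_{s=0}^{S} \binom{r+s}{r} x^r y^s &\le \sum_{k=0}^{R+S} \sum_{p=0}^{k} \frac{k!}{p!\,(k-p)!}\, x^p y^{k-p} \\
&= \sum_{k=0}^{R+S} \sum_{p=0}^{k} \binom{k}{p} x^p y^{k-p} \\
&= \sum_{k=0}^{R+S} (x + y)^k.
\end{align*}

We have, upon substituting $\ell_A = \|L_A\|$, $\ell_B = \|L_B\|$,

\[ \|X\| \le \frac{\|C\|}{\lambda_A - \lambda_B} \sum_{k=0}^{n_A+n_B-2} \Big( \frac{\ell_A + \ell_B}{\lambda_A - \lambda_B} \Big)^k \]

or

\[ \|X\| \le \frac{\|C\|}{\lambda_A - \lambda_B} \sum_{k=0}^{n_A+n_B-2} \mu_{AB}^k, \qquad \mu_{AB} = \frac{\ell_A + \ell_B}{\lambda_A - \lambda_B}. \tag{2.2-5} \]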
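
The bound (2.2-5) can be checked on a small example. The sketch below is illustrative only: it assumes NumPy, uses the spectral norm as the axis-oriented multiplicative norm, and obtains X by solving $-AX + XB = C$ through a Kronecker-product (vec) formulation rather than by evaluating the integral. The helper names and the sample matrices are hypothetical.

```python
import numpy as np

def solve_sylvester_kron(A, B, C):
    """Solve -A X + X B = C using column-stacked vec and Kronecker products."""
    nA, nB = A.shape[0], B.shape[0]
    # vec(A X) = (I kron A) vec(X), vec(X B) = (B^T kron I) vec(X)
    K = -np.kron(np.eye(nB), A) + np.kron(B.T, np.eye(nA))
    x = np.linalg.solve(K, C.reshape(-1, order="F"))
    return x.reshape((nA, nB), order="F")

def bound_2_2_5(A, B, C, lam_A, lam_B):
    """Right-hand side of (2.2-5) with the spectral norm; assumes lam_A > lam_B."""
    nA, nB = A.shape[0], B.shape[0]
    ell_A = np.linalg.norm(A - lam_A * np.eye(nA), 2)   # norm of L_A
    ell_B = np.linalg.norm(B - lam_B * np.eye(nB), 2)   # norm of L_B
    delta = lam_A - lam_B
    mu = (ell_A + ell_B) / delta
    return np.linalg.norm(C, 2) / delta * sum(mu ** k for k in range(nA + nB - 1))

# Hypothetical data: one repeated eigenvalue per matrix, lam_A > lam_B.
lam_A, lam_B = 2.0, 0.5
A = lam_A * np.eye(3) + np.array([[0.0, 0.0, 0.0],
                                  [0.3, 0.0, 0.0],
                                  [-0.2, 0.5, 0.0]])
B = lam_B * np.eye(2) + np.array([[0.0, 0.0],
                                  [0.4, 0.0]])
C = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])

X = solve_sylvester_kron(A, B, C)
norm_X = np.linalg.norm(X, 2)
bound = bound_2_2_5(A, B, C, lam_A, lam_B)
print(norm_X, bound)
```

The Kronecker solve is used only because it is short and exact for small orders; for larger matrices a dedicated Sylvester solver would be the natural substitute.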

This matrix X is, of course, the solution of the matrix equation $-AX + XB = C$ provided $\lambda_A > \lambda_B$. If however we only know that $\lambda_A \ne \lambda_B$ the solution to the matrix equation is given (see Lemma B, Appendix) by

\[ X = -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A t]\, C\, \exp[e^{i\theta} B t]\, dt \]

where $\theta$ is given by the above cited theorem. We may then generalize (2.2-5) in the case where $\lambda_A \ne \lambda_B$.

LEMMA 11. Let A, B be lower triangular matrices of order $n_A$, $n_B$ respectively, with repeated roots $\lambda_A$, $\lambda_B$ respectively. If $\lambda_A \ne \lambda_B$ and

\[ X = -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A t]\, C\, \exp[e^{i\theta} B t]\, dt \]

is a solution of the matrix equation $-AX + XB = C$ then

\[ \|X\| \le \frac{\|C\|}{|\lambda_A - \lambda_B|} \sum_{k=0}^{n_A+n_B-2} \mu_{AB}^k, \qquad \mu_{AB} = \frac{\ell_A + \ell_B}{|\lambda_A - \lambda_B|}. \]

The proof is similar to that used in deriving (2.2-5), with $|\lambda_A - \lambda_B|$ replacing $\lambda_A - \lambda_B$, and will not be repeated.

DEFINITION. A restricted Schur form is an ordered Schur form in which the sets introduced in section 2.1 each contain only one (repeated) eigenvalue.

Let $A = (A_{ij})$ be such a form. Then we may write

\[ A_{rr} = \lambda_r I + L_r, \qquad A_{ss} = \lambda_s I + L_s \]

where $A_{rr}$ ($A_{ss}$) is of order $n_r$ ($n_s$) and $L_r$ and $L_s$ are strictly lower triangular matrices. If $\lambda_r \ne \lambda_s$ let

\[ \gamma_{rs} = \frac{1}{|\lambda_r - \lambda_s|} \sum_{k=0}^{n_r+n_s-2} \mu_{rs}^k \]

where

\[ \mu_{rs} = \frac{\ell_r + \ell_s}{|\lambda_r - \lambda_s|} \]

and $\ell_r = \|L_r\|$, $\ell_s = \|L_s\|$ for an axis-oriented multiplicative norm $\|\cdot\|$. From Lemma 11,

\[ \Big\| -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A_{rr} t]\, C\, \exp[e^{i\theta} A_{ss} t]\, dt \Big\| \le \gamma_{rs}\, \|C\|. \]

Since

\[ \tilde A^{(p)}_{rs} = -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A_{rr} t]\, (\tilde A \tilde A^{(p-1)})_{rs}\, \exp[e^{i\theta} A_{ss} t]\, dt, \]

\[ \|\tilde A^{(p)}_{rs}\| \le \gamma_{rs}\, \|(\tilde A \tilde A^{(p-1)})_{rs}\|. \]


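For $r > s$, let

\[ Y^{(p)}_{rs} = \sum_{s < n_1 < n_2 < \cdots < n_{p-1} < r} \|A_{r n_{p-1}}\|\, \|A_{n_{p-1} n_{p-2}}\| \cdots \|A_{n_1 s}\|\; \gamma_{rs}\, \gamma_{n_{p-1} s} \cdots \gamma_{n_1 s}, \qquad p = 1, 2, \ldots, k-1 \tag{2.2-6} \]

(for $p = 1$ the sum reduces to the single term $\|A_{rs}\|\, \gamma_{rs}$), and

\[ Y^{(0)}_{rs} = \|I_{rs}\|. \]

For $r \le s$, let

\[ Y^{(p)}_{rs} = 0 \quad (p = 1, 2, \ldots, k-1), \qquad Y^{(0)}_{rs} = \|I_{rs}\|. \]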


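LEMMA 12. For $r > s$, $p = 0, 1, \ldots, k-1$,

\[ \|\tilde A^{(p)}_{rs}\| \le Y^{(p)}_{rs}. \]

Proof.

\[ \|\tilde A^{(0)}_{rs}\| = \|I_{rs}\| = Y^{(0)}_{rs}, \]

\[ \|\tilde A^{(1)}_{rs}\| = \Big\| -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A_{rr} t]\, (\tilde A \tilde A^{(0)})_{rs}\, \exp[e^{i\theta} A_{ss} t]\, dt \Big\| \le \|A_{rs}\|\, \gamma_{rs} = Y^{(1)}_{rs}. \]

Assuming the inequality holds for p,

\begin{align*}
\|\tilde A^{(p+1)}_{rs}\| &= \Big\| -\int_0^\infty e^{i\theta} \exp[-e^{i\theta} A_{rr} t]\, (\tilde A \tilde A^{(p)})_{rs}\, \exp[e^{i\theta} A_{ss} t]\, dt \Big\| \\
&\le \gamma_{rs}\, \|(\tilde A \tilde A^{(p)})_{rs}\| \\
&\le \gamma_{rs}\, \Big\| \sum_{s < n_p < r} A_{r n_p}\, \tilde A^{(p)}_{n_p s} \Big\| \\
&\le \gamma_{rs} \sum_{s < n_p < r} \|A_{r n_p}\|\, \|\tilde A^{(p)}_{n_p s}\| \\
&\le \gamma_{rs} \sum_{s < n_p < r} \|A_{r n_p}\| \sum_{s < n_1 < n_2 < \cdots < n_{p-1} < n_p} \|A_{n_p n_{p-1}}\|\, \|A_{n_{p-1} n_{p-2}}\| \cdots \|A_{n_1 s}\|\; \gamma_{n_p s}\, \gamma_{n_{p-1} s} \cdots \gamma_{n_1 s} \\
&= \sum_{s < n_1 < n_2 < \cdots < n_p < r} \|A_{r n_p}\|\, \|A_{n_p n_{p-1}}\| \cdots \|A_{n_1 s}\|\; \gamma_{rs}\, \gamma_{n_p s} \cdots \gamma_{n_1 s} \\
&= Y^{(p+1)}_{rs}
\end{align*}

which is the statement of the lemma for p + 1.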
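
The induction step above amounts to the recursion $Y^{(p+1)}_{rs} = \gamma_{rs} \sum_{n} \|A_{rn}\|\, Y^{(p)}_{ns}$ for $r > s$, which is convenient for computing the majorants. The sketch below assumes NumPy; the inputs are hypothetical k x k arrays holding the block norms $\|A_{rs}\|$ and the constants $\gamma_{rs}$ (only their strictly lower triangles are used), and the function name is an illustration, not the dissertation's notation.

```python
import numpy as np

def majorant_matrices(block_norms, gamma):
    """Return [Y^(0), Y^(1), ..., Y^(k-1)] as k x k arrays.

    Uses Y^(0) = I_k and, for r > s,
    Y^(p+1)_rs = gamma_rs * sum_n ||A_rn|| * Y^(p)_ns,
    which reproduces the chain sums of (2.2-6).
    """
    k = block_norms.shape[0]
    a = np.tril(block_norms, k=-1)   # norms of the off-diagonal blocks
    g = np.tril(gamma, k=-1)         # gamma_rs for r > s, zero elsewhere
    Y = [np.eye(k)]
    for _ in range(1, k):
        Y.append(g * (a @ Y[-1]))    # elementwise product with gamma
    return Y

# Hypothetical data for k = 3 blocks.
block_norms = np.array([[0.0, 0.0, 0.0],
                        [2.0, 0.0, 0.0],
                        [3.0, 4.0, 0.0]])
gamma = np.array([[0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.0],
                  [0.25, 2.0, 0.0]])
Y = majorant_matrices(block_norms, gamma)
print(Y[1])
print(Y[2])
```

For k = 3 the only nonzero entry of $Y^{(2)}$ is the (3,1) entry $\|A_{32}\|\,\|A_{21}\|\,\gamma_{31}\gamma_{21}$, which is what the recursion returns.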
Define

\[ \hat A^{(p)} = (\|\tilde A^{(p)}_{rs}\|), \qquad Y^{(p)} = (Y^{(p)}_{rs}), \qquad \hat N = (\|N_{rs}\|) \]

for $r, s = 1, 2, \ldots, k$, $p = 0, 1, \ldots, k-1$. Then, recalling that

\[ N = I + \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)} \]

we have

\[ \hat N \le I_k + \hat A^{(1)} + \hat A^{(2)} + \cdots + \hat A^{(k-1)} \le I_k + Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)} \]

for $\hat A^{(p)} \le Y^{(p)}$ in view of Lemma 12. If $\nu$ is a monotone norm

\[ \nu(\hat N) \le \nu(I_k + Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)}). \]

But, as shown in Lemma 3, the function f defined by $f(M) = \nu(\hat M)$ for all k x k partitioned matrices M, is a norm. We have then the following result.

THEOREM 4. Let $A = (A_{ij})$ be a restricted Schur form, $\|\cdot\|$ a multiplicative axis-oriented matrix norm and $\nu$ a monotone norm. Then the function f defined above is a norm and

\[ f(N) \le \nu(I_k + Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)}) \]

where the elements of $Y^{(p)}$ are given by (2.2-6).

Writing

\[ N = I + P \]

with

\[ P = \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)}, \]

\[ N^{-1} = (I + P)^{-1} = I - P + P^2 - \cdots + (-1)^{k-1} P^{k-1}, \]

\[ \widehat{N^{-1}} \le I_k + \hat P + \widehat{P^2} + \cdots + \widehat{P^{k-1}} \le I_k + \hat P + (\hat P)^2 + \cdots + (\hat P)^{k-1} \tag{2.2-7} \]

since $\widehat{P^r} \le (\hat P)^r$ for all r.

Noting that $\hat P \le \hat A^{(1)} + \hat A^{(2)} + \cdots + \hat A^{(k-1)} \le Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)}$ and setting $L = Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)}$ we have from (2.2-7)

\[ \widehat{N^{-1}} \le I_k + L + L^2 + \cdots + L^{k-1}. \]

For any monotone norm $\nu$,

\[ \nu(\widehat{N^{-1}}) \le \nu(I_k) + \nu(L) + [\nu(L)]^2 + \cdots + [\nu(L)]^{k-1}. \]

Setting $f(N^{-1}) = \nu(\widehat{N^{-1}})$ we have by Lemma 3 that f is a norm of $N^{-1}$ and consequently arrive at the following theorem.

THEOREM 5. With the same hypotheses as in Theorem 4 we have

\[ C_f(N) = f(N)\, f(N^{-1}) \le \nu(I_k + L) \big[ \nu(I_k) + \nu(L) + [\nu(L)]^2 + \cdots + [\nu(L)]^{k-1} \big] \]

where $L = Y^{(1)} + Y^{(2)} + \cdots + Y^{(k-1)}$, the $Y^{(p)}$ defined as in (2.2-6).


CHAPTER  3 


Introduction. The representation for N given here is identical to that given in the preceding chapter. This follows from the uniqueness of the solution of the matrix equation $-AX + XB = C$ guaranteed by the assumption of an ordered Schur form. The preceding chapter has demonstrated the formal methods for block elimination and at the same time indicated the complicated process of determining N. Computationally it will be seen that the methods of this chapter are superior to those of Chapter 2. Proofs given in that chapter shall be adopted here, specializing to the case in hand. The development of this chapter is more straightforward than that of the preceding and is recommended for practical applications. Chapter 2 should then primarily be considered for its theoretical value.


Section  3.1.  A  scalar  development  for  N. 

We present here a scalar analogue to the material developed in 2.1. We again assume that $A = (A_{ij})$, $i, j = 1, 2, \ldots, k$, is an ordered Schur form with $A_{ij}$ of order $n_i \times n_j$, with $\sum_{i=1}^{k} n_i = n$.

Letting $a_{ij}$ denote the (i,j) element of A considered as a scalar matrix, we say that the pair of indices (i,j) with $i > j$ is of type P and denote this by $(i,j) \in P$ if $a_{ij}$ is not contained in any of the diagonal blocks of A considered as an ordered Schur form.

If $e_t$ denotes the 1 x n row vector whose only non-zero element (equal to 1) appears in the t-th position it follows that

\[ e_i e_j^T = \begin{cases} 0, & j \ne i \\ 1, & j = i \end{cases} \]

\[ a_{ij} = e_i A e_j^T. \]


DEFINITION. The matrices $S_{ij}$ defined by

\[ S_{ij} = S[e_i, e_j, n_{ij}] = I + e_i^T n_{ij} e_j, \]

where $n_{ij}$ is an arbitrary complex number for $(i,j) \in P$ and zero otherwise, are called P-elementary matrices. For $i > j$,

\[ (S_{ij})_{rs} = \begin{cases} 1, & r = s \\ n_{ij}, & r = i,\ s = j \\ 0 & \text{otherwise.} \end{cases} \]

The following are immediate results of the above definitions.

\[ S_{ij}^{-1} = S[e_i, e_j, -n_{ij}] \]

\[ \prod_{i > j} S_{ij} = I + \Big( \sum_{i > j} e_i^T n_{ij} \Big) e_j. \tag{3.1-1} \]

That is, letting

\[ N_j = I + \Big( \sum_{i > j} e_i^T n_{ij} \Big) e_j, \]

\[ (N_j)_{rs} = \begin{cases} 1, & r = s \\ n_{rj}, & s = j,\ r > j \\ 0 & \text{otherwise.} \end{cases} \]

Furthermore, letting

\[ N = \prod_{j=1}^{n} N_j \tag{3.1-2} \]

we have

\[ N_{rs} = \begin{cases} 1, & r = s \\ n_{rs}, & r > s \\ 0 & \text{otherwise.} \end{cases} \]


We proceed, as in Chapter 2, to eliminate one by one all those elements whose subscripts are of type P. In this manner we shall again arrive at a quasi-diagonal matrix.

DEFINITION. The function $f_{ij}$ defined on all matrices A of order n by

\[ f_{ij}(A) = S_{ij}^{-1} A\, S_{ij} \]

is called a P-elementary similarity transformation.


Let

\[ A' = f_{ij}(A). \]

Using results similar to (2.1-7), (2.1-8) and (2.1-9) we have

\[ a'_{rs} = \begin{cases}
a_{is} - n_{ij}\, a_{js}, & r = i,\ s \ne j & (3.1\text{-}3) \\
a_{rj} + a_{ri}\, n_{ij}, & r \ne i,\ s = j & (3.1\text{-}4) \\
a_{ij} + a_{ii}\, n_{ij} - n_{ij}\, a_{jj}, & r = i,\ s = j & (3.1\text{-}5) \\
a_{rs} & \text{otherwise} & (3.1\text{-}6)
\end{cases} \]

Only elements in the i-th row and those in the j-th column whose subscripts are of type P are affected by $f_{ij}$, and $A'_{rr} = A_{rr}$, $r = 1, 2, \ldots, k$.


By (3.1-5), $a'_{ij} = 0$ if and only if there exists a complex number $n_{ij}$ such that

\[ a_{ij} + a_{ii}\, n_{ij} - n_{ij}\, a_{jj} = 0. \tag{3.1-7} \]

That this is always possible can be seen by choosing

\[ n_{ij} = \frac{a_{ij}}{a_{jj} - a_{ii}}. \]

The denominator never vanishes, for $(i,j) \in P$, which precludes the possibility that $a_{ii} = a_{jj}$.

We now study the effect of successive applications of P-elementary similarity transformations $f_{ij}$, where at each stage $n_{ij}$ is chosen such that equations similar to (3.1-7) are satisfied.

Let

\[ A^{(1,1)} = f_{11}(A) = A \]

\[ A^{(i,j)} = f_{ij}\big[ A^{(i-1,j)} \big] \]

where $f_{ij}$ is chosen such that $a^{(i,j)}_{ij} = 0$ in (3.1-7). We may then write

\[ A^{(i,j)} = S_{ij}^{-1}\, A^{(i-1,j)}\, S_{ij}, \quad (i,j) \ne (1,1) \]

\[ A^{(1,1)} = A. \]
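
A single elimination step, (3.1-3) through (3.1-7), can be sketched as follows. This is an illustration under stated assumptions: NumPy is used, indices are 0-based, A is lower triangular with $a_{ii} \ne a_{jj}$ for the chosen pair, and the function name `p_elementary_similarity` is hypothetical.

```python
import numpy as np

def p_elementary_similarity(A, i, j):
    """Apply f_ij(A) = S^{-1} A S with S = I + n_ij e_i^T e_j.

    n_ij = a_ij / (a_jj - a_ii) is the choice following (3.1-7) that zeroes
    the (i, j) entry. Returns the transformed matrix and n_ij.
    """
    n = A.shape[0]
    n_ij = A[i, j] / (A[j, j] - A[i, i])
    S = np.eye(n)
    S[i, j] = n_ij
    S_inv = np.eye(n)
    S_inv[i, j] = -n_ij           # inverse of a P-elementary matrix
    return S_inv @ A @ S, n_ij

# Hypothetical lower triangular matrix with distinct diagonal entries.
A = np.array([[1.0, 0.0, 0.0],
              [2.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])
A1, n10 = p_elementary_similarity(A, 1, 0)
print(n10)
print(A1)
```

In this example the entry in row 1, column 0 is eliminated, the diagonal is untouched, and the only other entry that changes is the one below it in column 0, in line with (3.1-4).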


Rewriting (3.1-3) through (3.1-6) in our new notation we have

\[ a^{(i,j)}_{rs} = \begin{cases}
a^{(i-1,j)}_{is} - n_{ij}\, a^{(i-1,j)}_{js}, & r = i,\ s \ne j \\
a^{(i-1,j)}_{rj} + a^{(i-1,j)}_{ri}\, n_{ij}, & r \ne i,\ s = j \\
a^{(i-1,j)}_{ij} + a^{(i-1,j)}_{ii}\, n_{ij} - n_{ij}\, a^{(i-1,j)}_{jj}, & r = i,\ s = j \\
a^{(i-1,j)}_{rs} & \text{otherwise}
\end{cases} \tag{3.1-8} \]

Since $a^{(i,j)}_{rr} = a_{rr}$, $r = 1, 2, \ldots, n$, we have

\[ a^{(i,j)}_{ij} = a^{(i-1,j)}_{ij} + a_{ii}\, n_{ij} - n_{ij}\, a_{jj}. \tag{3.1-9} \]


We claim that the result of application of the $f_{ij}$ in some order is to reduce A to a quasi-diagonal form. This follows from the next lemma.

LEMMA 13. Suppose we have $a^{(i-1,j)}_{rs} = 0$ for $s > r$; $s < j$ with $s < r \le n$; $s = j$ with $j < r \le i - 1$, and let $n_{ij}$ be chosen such that $a^{(i,j)}_{ij} = 0$ in (3.1-9); then $a^{(i,j)}_{rs} = 0$ for $s > r$; $s < j$ with $s < r \le n$; $s = j$ with $j < r \le i$.

The proof is analogous to that of Lemma 7 of Chapter 2 and will not be repeated. As in Lemma 7 the proof shows that the only elements of $A^{(i-1,j)}$ which are altered by $f_{ij}$ are $a^{(i-1,j)}_{kj}$, $k \ge i$ with $(k,j) \in P$.

Thus the sequence of elementary transformations, determined by eliminating each element with indices of type P, progressing down each column first and then by columns left to right, reduces A to a quasi-diagonal matrix whose diagonal entries are precisely those of A.

Let X be the matrix formed by the multiplication of the $S_{ij}$ taken in the same order as the $f_{ij}$. Namely

\[ X = \prod_{j=1}^{n} \prod_{i > j} S_{ij}. \]

Then $X^{-1}AX = Q$, a quasi-diagonal matrix with $Q_{ii} = A_{ii}$. But from (3.1-1) and (3.1-2)

\[ \prod_{j=1}^{n} \prod_{i > j} S_{ij} = N. \]

Thus $X = N$ and we have:

THEOREM 6. Let $N = (n_{ij})$ be a triangular matrix such that $n_{ij}$ satisfies $a^{(i,j)}_{ij} = 0$ for $(i,j) \in P$; $n_{ij} = 0$ for $(i,j) \notin P$ except that $n_{ii} = 1$, $i = 1, 2, \ldots, n$; then

\[ N^{-1} A N = Q \]

where Q is quasi-diagonal and $Q_{ii} = A_{ii}$, $i = 1, 2, \ldots, k$.


Section  3.2.  A  bound  for  a  condition  number  of  N. 

This section is devoted to the determination of a bound for a condition number of N, where N was defined in the preceding section. We begin by finding an expression for the element of the form $a^{(i-1,j)}_{ij}$ which involves elements of A and N. Finally we find an expression for N which involves only elements of A.

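LEMMA 14. For $j + p < r$,

\[ a^{(j+p,j)}_{rj} = \sum_{\ell=0}^{p} a_{r,j+\ell}\; n_{j+\ell,j}. \tag{3.2-1} \]

Proof. We shall prove this lemma by induction on p. For p = 1 we have by (3.1-8)

\begin{align*}
a^{(j+1,j)}_{rj} &= a^{(j,j)}_{rj} + a^{(j,j)}_{r,j+1}\; n_{j+1,j} \\
&= a_{rj} + a_{r,j+1}\; n_{j+1,j} \tag{3.2-2} \\
&= \sum_{\ell=0}^{1} a_{r,j+\ell}\; n_{j+\ell,j},
\end{align*}

(3.2-2) holding by virtue of Lemma 13.

Assuming (3.2-1) holds for p - 1 we have by (3.1-8)

\begin{align*}
a^{(j+p,j)}_{rj} &= a^{(j+p-1,j)}_{rj} + a^{(j+p-1,j)}_{r,j+p}\; n_{j+p,j} \\
&= a^{(j+p-1,j)}_{rj} + a_{r,j+p}\; n_{j+p,j} \tag{3.2-3} \\
&= \sum_{\ell=0}^{p-1} a_{r,j+\ell}\; n_{j+\ell,j} + a_{r,j+p}\; n_{j+p,j} \\
&= \sum_{\ell=0}^{p} a_{r,j+\ell}\; n_{j+\ell,j},
\end{align*}

(3.2-3) holding by virtue of Lemma 13.

In particular, we have for $p = (r - 1) - j$

LEMMA 15. If $r > j$,

\[ a^{(r-1,j)}_{rj} = \sum_{\ell=j}^{r-1} a_{r\ell}\; n_{\ell j}. \]

As we have seen before

\[ n_{ij} = \frac{a^{(i-1,j)}_{ij}}{a_{jj} - a_{ii}} \quad \text{for } (i,j) \in P. \]

Recalling that the diagonal entries of A are its eigenvalues we may write $\lambda_j$ for $a_{jj}$ and $\lambda_i$ for $a_{ii}$. Then

\[ n_{ij} = \frac{a^{(i-1,j)}_{ij}}{\lambda_j - \lambda_i}. \]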


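Let

\[ \tilde A = A - \mathrm{diag}(A_{11}, A_{22}, \ldots, A_{kk}). \]

$\tilde A$ is then a (strictly) lower triangular scalar matrix such that

\[ \tilde a_{ij} = \begin{cases} a_{ij} & \text{for } (i,j) \in P \\ 0 & \text{otherwise.} \end{cases} \]

Define

\[ \tilde A^{(0)} = I \]

\[ \tilde A^{(1)}:\quad \tilde A^{(1)}_{ij} = \begin{cases} \dfrac{(\tilde A \tilde A^{(0)})_{ij}}{\lambda_j - \lambda_i} & \text{for } (i,j) \in P \\ 0 & \text{otherwise} \end{cases} \]

and in general

\[ \tilde A^{(r)}:\quad \tilde A^{(r)}_{ij} = \begin{cases} \dfrac{(\tilde A \tilde A^{(r-1)})_{ij}}{\lambda_j - \lambda_i} & \text{for } (i,j) \in P \\ 0 & \text{otherwise.} \end{cases} \]

Note that $\tilde A^{(r)} = 0$, $r \ge k$, since each successive multiplication by $\tilde A$ introduces at least one new diagonal of zero block matrices.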

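THEOREM 7.

\[ N = \tilde A^{(0)} + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)}. \]

Proof. $n_{jj} = 1$ by construction and $(\tilde A^{(0)} + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{ij} = 0$ for $i \ne j$, $(i,j) \notin P$, which agrees with the above.

Assume now that

\[ n_{\ell j} = (\tilde A^{(0)} + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{\ell j} \quad \text{for } j < \ell \le i,\ (\ell,j) \in P; \]

then

\begin{align*}
n_{i+1,j} &= \frac{a^{(i,j)}_{i+1,j}}{\lambda_j - \lambda_{i+1}} \\
&= \frac{1}{\lambda_j - \lambda_{i+1}} \sum_{\ell=j}^{i} a_{i+1,\ell}\; n_{\ell j} \\
&= \frac{1}{\lambda_j - \lambda_{i+1}} \sum_{\ell=j}^{i} \tilde a_{i+1,\ell}\, (\tilde A^{(0)} + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{\ell j} \\
&= \frac{1}{\lambda_j - \lambda_{i+1}} \Big[ (\tilde A \tilde A^{(0)})_{i+1,j} + (\tilde A \tilde A^{(1)})_{i+1,j} + \cdots + (\tilde A \tilde A^{(k-1)})_{i+1,j} \Big] \\
&= \tilde A^{(1)}_{i+1,j} + \tilde A^{(2)}_{i+1,j} + \cdots + \tilde A^{(k-1)}_{i+1,j} + \tilde A^{(k)}_{i+1,j} \\
&= (\tilde A^{(0)} + \tilde A^{(1)} + \cdots + \tilde A^{(k-1)})_{i+1,j}
\end{align*}

since $\tilde A^{(k)} = 0$ and $\tilde A^{(0)}_{i+1,j} = 0$.
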

Letting $L = \tilde A^{(1)} + \tilde A^{(2)} + \cdots + \tilde A^{(k-1)}$ we note that L is strictly lower triangular and $L^r = 0$, $r \ge k$. Thus

\[ N^{-1} = (I + L)^{-1} = I - L + L^2 - \cdots + (-1)^{k-1} L^{k-1}. \]

THEOREM 8. If $\nu$ is any multiplicative norm,

\[ C_\nu(N) \le \nu(I + L) \big\{ \nu(I) + \nu(L) + \cdots + [\nu(L)]^{k-1} \big\}. \]

Inasmuch as the elements of $\tilde A^{(r)}$ and hence L are easily computable it is not necessary to introduce a restricted Schur form.
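
The recursion for the $\tilde A^{(r)}$ and the bound of Theorem 8 can be tried out in the simplest setting. The sketch below is deliberately restricted: it assumes NumPy, a lower triangular A whose diagonal entries are all distinct (so every diagonal block is 1 x 1, k = n, and "quasi-diagonal" means diagonal), and the spectral norm as $\nu$. The function name `build_N` and the sample matrix are hypothetical.

```python
import numpy as np

def build_N(A):
    """N = sum_{r=0}^{n-1} At^(r), assuming distinct diagonal entries of A.

    At is the strictly lower triangular part of A, At^(0) = I and
    At^(r)_ij = (At @ At^(r-1))_ij / (lam_j - lam_i) for i > j.
    """
    n = A.shape[0]
    lam = np.diag(A)
    At = np.tril(A, k=-1)
    # delta[i, j] = 1 / (lam_j - lam_i) below the diagonal, 0 elsewhere
    delta = np.zeros((n, n))
    rows, cols = np.tril_indices(n, k=-1)
    delta[rows, cols] = 1.0 / (lam[cols] - lam[rows])
    term = np.eye(n)
    N = np.eye(n)
    for _ in range(1, n):
        term = delta * (At @ term)   # next At^(r)
        N = N + term
    return N

# Hypothetical lower triangular matrix with eigenvalues 1, 2, 3, 4.
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [2.0, 2.0, 0.0, 0.0],
              [-1.0, 3.0, 3.0, 0.0],
              [0.5, 1.0, -2.0, 4.0]])
N = build_N(A)
Q = np.linalg.solve(N, A @ N)        # N^{-1} A N
L = N - np.eye(4)
nu_L = np.linalg.norm(L, 2)
cond = np.linalg.norm(N, 2) * np.linalg.norm(np.linalg.inv(N), 2)
bound = np.linalg.norm(N, 2) * sum(nu_L ** m for m in range(4))
print(np.round(Q, 12))
print(cond, bound)
```

With larger diagonal blocks the set P shrinks and the divisor table `delta` would be filled only for index pairs of type P; that case is not exercised here.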


CHAPTER  4 


INTRODUCTION. Two applications of the results of the preceding chapters will be considered here. Section 4.1 will deal with bounds for norms of powers of a fixed non-normal matrix. Bounds for the norm of the inverse of a fixed matrix are developed in Section 4.2 and applied to the problem of estimating residual vectors and matrices associated with the approximate solution of linear systems and approximate inverses.

Section 4.1. Iterated Matrices. The interest for bounds of norms of certain matrices arises principally from the study of finite difference schemes for solving hyperbolic and parabolic differential equations. Such bounds have been given by Lax and Richtmyer [18]. For arbitrary matrices Gautschi [9, 10] and Ostrowski [24] have developed estimates which require some knowledge of the Jordan canonical form. More recently Henrici [14] has given bounds which depend upon the spectral radius and a certain measure of non-normality introduced in his paper.

For normal matrices we have of course

\[ \sigma(B^r) = \lambda_B^r \tag{4.0-1} \]

as a simple consequence of the fact that normal matrices are unitarily similar to diagonal matrices and $\sigma$ is both a unitarily invariant and an axis-oriented norm.

In contrast to the above result, for non-normal matrices Theorem 9 below gives an estimate for $\sigma(B^r)$ which depends upon all the eigenvalues of B according to their multiplicities, and a condition number. These results reduce to (4.0-1) for B normal.

Let then B be a given n x n matrix and U a unitary matrix such that

\[ A = UBU^* \]

is an ordered Schur form. Let N be chosen as in Theorem 6 such that $Q = N^{-1}AN$ is quasi-diagonal with $Q = \mathrm{diag}(Q_{11}, Q_{22}, \ldots, Q_{kk})$ and $Q_{ii} = A_{ii}$ of order $n_i$, $i = 1, 2, \ldots, k$.

Thus

\[ A^r = N Q^r N^{-1}. \]

If $\sigma$ represents the spectral norm, we have using Lemma 6

\[ \sigma(A^r) \le C_\sigma(N)\, \sigma\big( \mathrm{diag}(Q_{11}^r, Q_{22}^r, \ldots, Q_{kk}^r) \big) \le C_\sigma(N) \max_{1 \le i \le k} \sigma(Q_{ii}^r). \tag{4.1-1} \]

Let $Q_{ii}$ be written as a sum of a diagonal matrix $D_i$ containing only its diagonal terms, and a strictly lower triangular matrix $L_i$:

\[ Q_{ii} = D_i + L_i \]

and

\[ L_i^r = 0, \quad r \ge n_i. \]

For an arbitrary ordered Schur form the $D_i$ are not necessarily scalar matrices and consequently will not commute with the $L_i$, preventing us from expanding $(D_i + L_i)^r$ according to the binomial theorem. But since $D_i$ is diagonal, if we expand $Q_{ii}^r$ any term with more than $n_i - 1$ $L_i$'s vanishes. Thus

\[ \sigma(Q_{ii}^r) \le \Lambda_i^r + \binom{r}{1} \Lambda_i^{r-1} \ell_i + \cdots + \binom{r}{n_i - 1} \Lambda_i^{r-n_i+1} \ell_i^{\,n_i-1} \tag{4.1-2} \]

where $\ell_i = \sigma(L_i)$ and $\Lambda_i = \lambda_{Q_{ii}}$. From (4.1-1) and (4.1-2)

\[ \sigma(A^r) \le C_\sigma(N) \max_{1 \le i \le k} \Big\{ \Lambda_i^r + \binom{r}{1} \Lambda_i^{r-1} \ell_i + \cdots + \binom{r}{n_i - 1} \Lambda_i^{r-n_i+1} \ell_i^{\,n_i-1} \Big\}. \]

Noting that $B^r = U^* A^r U$ and recalling that $\sigma$ is unitarily invariant we have:

\[ \sigma(B^r) \le C_\sigma(N) \max_{1 \le i \le k} \Big\{ \Lambda_i^r + \binom{r}{1} \Lambda_i^{r-1} \ell_i + \cdots + \binom{r}{n_i - 1} \Lambda_i^{r-n_i+1} \ell_i^{\,n_i-1} \Big\}. \]

The estimate holds for any ordered Schur form. Indeed, since specification of the ordering of the eigenvalues does not uniquely determine a Schur form we can conclude:


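THEOREM 9. If $\lambda_B > 0$,

\[ \sigma(B^r) \le \min \Big\{ C_\sigma(N) \max_i \Big[ \Lambda_i^r + \binom{r}{1} \Lambda_i^{r-1} \ell_i + \cdots + \binom{r}{n_i - 1} \Lambda_i^{r-n_i+1} \ell_i^{\,n_i-1} \Big] \Big\}. \]

If $\lambda_B = 0$,

\[ \sigma(B^r) \le \min \Big\{ C_\sigma(N) \max_i \ell_i^{\,r} \Big\}, \quad r = 0, 1, \ldots, M - 1 \]

\[ \sigma(B^r) = 0, \quad r \ge M \]

where $M = \max_i n_i$ and where the minimum is taken over all ordered Schur forms.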
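
The single-block estimate (4.1-2) that drives Theorem 9 is easy to test. The sketch below assumes NumPy, takes one lower triangular block Q = D + L with hypothetical entries, and uses $\Lambda = \sigma(D)$ (the largest modulus on the diagonal) and $\ell = \sigma(L)$; the function name is an illustration only.

```python
import numpy as np
from math import comb

def block_power_bound(Q, r):
    """Right-hand side of (4.1-2) for a lower triangular block Q = D + L."""
    n = Q.shape[0]
    D = np.diag(np.diag(Q))
    Lam = np.linalg.norm(D, 2)        # spectral norm of D = max |diagonal entry|
    ell = np.linalg.norm(Q - D, 2)    # spectral norm of the strictly lower part
    # terms containing n or more factors of L vanish, so m stops at n - 1
    return sum(comb(r, m) * Lam ** (r - m) * ell ** m
               for m in range(min(r, n - 1) + 1))

# Hypothetical block with a non-scalar diagonal.
Q = np.array([[0.9, 0.0, 0.0],
              [0.5, -0.7, 0.0],
              [1.0, 0.3, 0.8]])
powers = [np.linalg.norm(np.linalg.matrix_power(Q, r), 2) for r in range(10)]
bounds = [block_power_bound(Q, r) for r in range(10)]
for r in range(10):
    print(r, powers[r], bounds[r])
```

Because the diagonal of Q is not scalar, the binomial theorem does not apply to $(D + L)^r$; the bound only counts how many words of each type occur in the non-commutative expansion.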

If B were normal any Schur form would be diagonal, implying that $N = I$. Thus $C_\sigma(N) = 1$ and $\sigma(B^r) \le \max_i \Lambda_i^r = \lambda_B^r$. Since $\lambda_B^r \le \sigma(B^r)$, $\sigma(B^r) = \lambda_B^r$ in agreement with (4.0-1).


Section 4.2. Bounds for inverses. Let B be an arbitrary non-singular matrix, b a given vector and $\tilde x$ an alleged solution of $Bx = b$. If we define the residual of $\tilde x$ by $r = B\tilde x - b$, and if $\nu$ is a vector norm, the error $\tilde x - B^{-1}b = B^{-1}r$ of $\tilde x$ can be estimated as follows:

\[ \nu(\tilde x - B^{-1}b) = \nu(B^{-1}r) \le \mu(B^{-1})\, \nu(r), \]

where $\mu$ denotes any matrix norm compatible with $\nu$. Similarly, if $\tilde X$ is an alleged inverse of B, and if $\nu$ is any multiplicative matrix norm, we can calculate a bound for $\nu(\tilde X - B^{-1})$ in terms of the residual matrix $R = B\tilde X - I$:

\[ \nu(\tilde X - B^{-1}) = \nu(B^{-1}R) \le \nu(B^{-1})\, \nu(R). \]
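
Both residual estimates can be illustrated directly. The sketch below assumes NumPy, the Euclidean vector norm with the spectral matrix norm as its compatible matrix norm, and made-up data; the perturbations standing in for an "alleged" solution and inverse are hypothetical.

```python
import numpy as np

# Hypothetical non-singular (strictly diagonally dominant) matrix and right-hand side.
B = np.array([[4.0, 1.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, -2.0, 0.5])

B_inv = np.linalg.inv(B)
norm_B_inv = np.linalg.norm(B_inv, 2)

# Alleged solution: exact solution plus a small perturbation.
x_exact = np.linalg.solve(B, b)
x_tilde = x_exact + np.array([1e-3, -2e-3, 5e-4])
r = B @ x_tilde - b                          # residual vector
err_vec = np.linalg.norm(x_tilde - x_exact)
bound_vec = norm_B_inv * np.linalg.norm(r)

# Alleged inverse: exact inverse plus a small perturbation.
X_tilde = B_inv + 1e-3 * np.ones((3, 3))
R = B @ X_tilde - np.eye(3)                  # residual matrix
err_mat = np.linalg.norm(X_tilde - B_inv, 2)
bound_mat = norm_B_inv * np.linalg.norm(R, 2)

print(err_vec, bound_vec)
print(err_mat, bound_mat)
```

In practice the norm of the inverse is not available, which is why a computable bound for it is sought.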

For both problems we require a bound for $\nu(B^{-1})$. Such a bound is, in principle, easily constructed if we assume that B is similar to a diagonal matrix D:

\[ B = SDS^{-1}. \]

For, assuming that $\nu$ is an axis-oriented norm and noting that $B^{-1} = SD^{-1}S^{-1}$, we have

\[ \nu(B^{-1}) \le C_\nu(S)\, \lambda_{B^{-1}}. \tag{4.2-1} \]

If B were normal then S could be taken unitary and the spectral condition number of S would be 1. In this case $\sigma(B^{-1}) \le \lambda_{B^{-1}}$, and in view of Lemma 4,

\[ \sigma(B^{-1}) = \lambda_{B^{-1}}. \]

Normal matrices are, of course, the only matrices unitarily similar to diagonal matrices. Estimates for non-normal matrices are not so easily derived. We see for instance that the bound (4.2-1), if at all applicable, requires the complete diagonalization of B.

If we let the function $f_n$ be defined for all real $x \ge 0$ by

\[ f_n(x) = x + x^2 + \cdots + x^n \]

we note that $f_n$ and $x^{-1} f_n$ are monotonically increasing for $x \ge 0$, and that

\[ \lim_{x \to 0+} x^{-1} f_n(x) = 1. \]

With the notation of the preceding section we have then:

THEOREM 10. If B is non-singular and non-normal, and if
?!  *  A  ^  £  „  then
eXB  *■)  <  min


C  (N)  max
&  i


where the minimum is taken over all ordered Schur forms.

Proof. As before we write $Q_{ii} = D_i + L_i$.


-l 

LI 


r  — i  i 

|D  (I  f  D  *■  L,)  I 

L  1  1  I  J 

U  *  D  1  I  )  1  D 


We cannot expand $(I + D_i^{-1}L_i)^{-1}$ according to the binomial expansion since $D_i$ is not necessarily scalar. But if we expand without commuting it is still true that any term with more than $n_i - 1$ $L_i$'s vanishes. Thus since



L  >  <  0*D  l>  oil  1  »  a 
i  -  1  '  l  1 


l 


1 


we  have,,  upon  setting 


tf(Q  ;>  <  u  +  4  +  r  + 

1 1  -  i  i 


V1  -i 

- 1  >  v 


f”'V  -l 

- -  A 


6< 


In  view  of 

and 

Of(Q_1) 


■  max  tf(Q  J) 
i  11 


-  tf(A  1), 


and 


tf(B  L)  <  C  .(N)  max 
-  0-  t 


for every ordered Schur form. The theorem follows.


CHAPTER  5 


Spectral  Variation  and  Eigenvalue  Variation 

Classical Results. Let the matrix $A = (a_{ij})$ have eigenvalues $\lambda_i$ and let $B = (b_{ij})$ have eigenvalues $\mu_i$, $i = 1, 2, \ldots, n$. The quantity

\[ s = s_A(B) = \max_{1 \le i \le n} \Big\{ \min_{1 \le j \le n} |\mu_i - \lambda_j| \Big\} \]

is called the spectral variation of B with respect to A. It is, in effect, the maximum distance from any eigenvalue of B to the spectrum of A. No one-to-one correspondence between eigenvalues is implied. However, the function v defined by

\[ v = v(A,B) = \min_{\pi} \max_{1 \le i \le n} |\lambda_i - \mu_{\pi(i)}| \]

where the minimum is taken with respect to all permutations $\pi$ of the set (1, 2, ..., n) and which is called the eigenvalue variation of A and B does imply a one-to-one correspondence. We have, of course, $v(A,B) = v(B,A)$ whereas $s_A(B) \ne s_B(A)$ in general. In addition

\[ s_A(B) \le v(A,B) \]

for all matrices A and B.

One of the best available bounds for s and v for arbitrary matrices is given by Ostrowski [22]. (See also [24] p. 192.) Namely, if $M = \max_{1 \le i,j \le n} (|a_{ij}|, |b_{ij}|)$ and if the norm $\alpha$ is as defined in Chapter 1, then

\[ s_A(B) \le (n + 2)\, M \Big[ \frac{\alpha(A - B)}{M} \Big]^{1/n} \]

and

\[ v(A,B) \le 2n(n + 2)\, M \Big[ \frac{\alpha(A - B)}{M} \Big]^{1/n}. \]


That the exponent 1/n in these bounds cannot be improved in general may be seen by considering an example due to G. E. Forsythe (see [28] p. 405). In special cases, however, improvements are possible.

If A is similar to a diagonal matrix D,

\[ A = SDS^{-1} \]

and if $\nu$ is any axis-oriented lub norm, then Bauer and Fike [3] showed that

\[ s_A(B) \le C_\nu(S)\, \nu(A - B). \tag{5.0-1} \]
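
The two variations and the Bauer-Fike estimate (5.0-1) can be computed for small matrices. The sketch below assumes NumPy and the spectral norm; the eigenvalue variation is found by brute force over all permutations, so it is only practical for small n. The function names and test matrices are hypothetical.

```python
import numpy as np
from itertools import permutations

def spectral_variation(A, B):
    """s_A(B): largest distance from an eigenvalue of B to the spectrum of A."""
    lam = np.linalg.eigvals(A)
    mu = np.linalg.eigvals(B)
    return max(min(abs(m - l) for l in lam) for m in mu)

def eigenvalue_variation(A, B):
    """v(A, B): best worst-case distance over all one-to-one pairings."""
    lam = np.linalg.eigvals(A)
    mu = np.linalg.eigvals(B)
    n = len(lam)
    return min(max(abs(lam[i] - mu[p[i]]) for i in range(n))
               for p in permutations(range(n)))

# Hypothetical diagonalizable A = S D S^{-1} and a small perturbation B = A + E.
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
D = np.diag([1.0, 2.0, 4.0])
A = S @ D @ np.linalg.inv(S)
E = 1e-2 * np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0]])
B = A + E

s = spectral_variation(A, B)
v = eigenvalue_variation(A, B)
bauer_fike = np.linalg.cond(S, 2) * np.linalg.norm(A - B, 2)
print(s, v, bauer_fike)
```

The asymmetry of the spectral variation can be seen by swapping the arguments of `spectral_variation`, whereas `eigenvalue_variation` is unchanged under the swap.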

If further, A is normal, S may be chosen unitary and we find for any norm $\nu$ majorizing the spectral norm

\[ s_A(B) \le \nu(A - B). \tag{5.0-2} \]

If A and B are both normal and $\nu = \varepsilon$ it follows from a result of Hoffman and Wielandt [15] that (5.0-2) is even valid for the eigenvalue variation

\[ v(A,B) \le \varepsilon(A - B). \tag{5.0-3} \]

This result has been used frequently by Bargmann, Montgomery and von Neumann in [3] for A and B either real symmetric or hermitian.

A more recent result applicable to arbitrary matrices has been contributed by Henrici [14]. Because of the part that these estimates play in this chapter we shall develop the necessary notation. These estimates depend in particular upon a measure of non-normality which we define here.

If A is any matrix, we recall that (Mirsky, [19]) there exists a unitary matrix U and a triangular matrix T such that

\[ A = UTU^*. \]

T, the Schur triangular form of A, is not uniquely determined for a given A. We put

\[ T = D + M \]

where D denotes the diagonal matrix whose main diagonal coincides with that of T. It follows then that M is a strictly (lower) triangular matrix.

If $\nu$ is a norm, the $\nu$-departure from normality of A is defined by

\[ \Delta_\nu(A) = \inf \nu(M) \]

where the infimum is taken with respect to all M that can appear in a Schur form. It follows that $\Delta_\nu(A) = 0$ if and only if A is normal.

Let the function $g = g(y)$ be defined for all real $y \ge 0$ as the (unique) non-negative solution of the equation

\[ g + g^2 + \cdots + g^n = y. \]

The function g is the inverse of the function $f_n$ defined in 4.2. For later use we note the relations

\[ \lim_{y \to 0+} y^{-1} g(y) = 1 \tag{5.0-4} \]

\[ n^{-1} y \le g(y) \le y, \quad 0 \le y \le n \tag{5.0-5} \]

\[ g(n) = 1 \tag{5.0-6} \]

\[ (n^{-1} y)^{1/n} \le g(y) \le y^{1/n}, \quad y \ge n \tag{5.0-7} \]

\[ \lim_{y \to \infty} y^{-1/n} g(y) = 1. \tag{5.0-8} \]
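
Since g has no closed form for general n, it is natural to evaluate it numerically. The sketch below is one simple way to do so, assuming plain Python and bisection on the increasing function $f_n$; the function names `f` and `g` mirror the text, while the tolerance-free fixed iteration count is an arbitrary choice.

```python
def f(x, n):
    """f_n(x) = x + x^2 + ... + x^n."""
    return sum(x ** m for m in range(1, n + 1))

def g(y, n, iterations=200):
    """Non-negative solution of g + g^2 + ... + g^n = y, found by bisection."""
    lo, hi = 0.0, max(1.0, y)      # f_n(max(1, y)) >= y, so the root is bracketed
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid, n) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n = 4
for y in (0.1, 1.0, 4.0, 50.0, 1000.0):
    print(y, g(y, n))
```

For n = 4 the printed values fall between the bounds of (5.0-5) when y is at most 4 and between those of (5.0-7) when y is at least 4.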


Henrici's results may now be given.

THEOREM (Henrici). Let A be a non-normal matrix and let
$B - A \ne 0$. If $\nu$ is a norm majorizing the spectral norm and if

    $y = \frac{\Delta_\nu(A)}{\nu(B - A)}$

then

(5.0-9)    $s_A(B) \le \frac{y}{g(y)}\,\nu(B - A)$

(5.0-10)   $v(A,B) \le (2n - 1)\,\frac{y}{g(y)}\,\nu(B - A)$.

(5.0-5), (5.0-6) and (5.0-7) may be used to render the bounds
(5.0-9) and (5.0-10) more explicit. For $\nu(B - A)$ bounded away
from zero, (5.0-4) shows that as $\Delta_\nu(A) \to 0$ the estimate
(5.0-9) approaches (5.0-?). (5.0-8) shows that for a fixed
non-normal A and for $B \to A$ the bound (5.0-9) is of the same order
as (5.0-3).
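
As an illustration of (5.0-9), consider the hypothetical 2 x 2 example A = [[0, 1], [0, 0]] and B = [[0, 1], [eps, 0]], which does not appear in the original. Here the spectral norm of B - A is eps, the spectral departure from normality of A is 1 (A is already triangular with zero diagonal, and the spectral norm is unitarily invariant), and the eigenvalues of B are the two square roots of eps. The sketch below, with assumed helper names, compares the true spectral distance with Henrici's bound for n = 2.

```python
import math


def g2(y):
    """Non-negative root of g + g^2 = y (the case n = 2), in closed form."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * y)) / 2.0


def henrici_bound(delta, nu_diff):
    """Right-hand side of (5.0-9) for n = 2: (y / g(y)) * nu(B - A)."""
    y = delta / nu_diff
    return y / g2(y) * nu_diff


eps = 0.01
spectral_distance = math.sqrt(eps)                # both eigenvalues of A are 0
bound = henrici_bound(delta=1.0, nu_diff=eps)     # Delta(A) = 1, nu(B - A) = eps
```

For this example the bound exceeds the true distance only slightly, which reflects the square-root behaviour described by (5.0-8).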

It should be mentioned that Wielandt [29] had previously
defined a measure of non-normality of a matrix. His measure is
applicable only to matrices which are similar to a diagonal matrix,
and requires the knowledge of a matrix effecting the diagonalization.

After deriving bounds for $s_A(B)$ and $v(A,B)$ using
quasi-diagonal representations we shall make a comparison between these
results and those of Henrici given above.

Section 5.1. Given an arbitrary matrix M let us assume
that a unitary U has been chosen such that $A = U^* M U$ is an
ordered Schur form and that A has been transformed by N into
$Q = \mathrm{diag}(Q_{11}, Q_{22}, \ldots, Q_{kk})$ as by Theorem 6. Let
$\lambda_{ij}$, $j = 1,2,\ldots,n_i$, be the eigenvalues of $Q_{ii}$, $i = 1,2,\ldots,k$.
Let B be an arbitrary matrix. We have from above:

    $N^{-1} A N = Q = \mathrm{diag}(Q_{11}, Q_{22}, \ldots, Q_{kk})$

    $Q_{ii} = D_i + L_i$,

$D_i$ being the diagonal matrix whose diagonal elements coincide with
those of $Q_{ii}$.

Let

    $N^{-1} B N = B_1$,   $E = B - A$,   $N^{-1} E N = F$.

Then

    $B_1 = F + Q$.

Let $\mu$ be an arbitrary but fixed eigenvalue of B (and
hence of $B_1$) which is not an eigenvalue of A. Then $(Q - \mu I)^{-1}$
exists and

    $0 = \det(B_1 - \mu I) = \det[Q - \mu I + F] = \det[Q - \mu I]\,\det[I + (Q - \mu I)^{-1} F]$.

Thus

    $\det[I + (Q - \mu I)^{-1} F] = 0$

and $-1$ is an eigenvalue of $(Q - \mu I)^{-1} F$. By Lemma 4

    $\sigma[(Q - \mu I)^{-1} F] \ge 1$.

But

    $\sigma(F) \le C_\sigma(N)\,\sigma(E)$

so

(5.1-1)    $\sigma[(Q - \mu I)^{-1}] \ge \frac{1}{C_\sigma(N)\,\sigma(E)}$.

$(D_i - \mu I)$ is non-singular since $D_i$ contains only eigenvalues
of A and consequently

(5.1-2)    $(Q_{ii} - \mu I)^{-1} = (D_i + L_i - \mu I)^{-1}$
               $= \{(D_i - \mu I)[I + (D_i - \mu I)^{-1} L_i]\}^{-1}$
               $= [I + (D_i - \mu I)^{-1} L_i]^{-1} (D_i - \mu I)^{-1}$.

Now

(5.1-3)    $[I + (D_i - \mu I)^{-1} L_i]^{-1} = I - (D_i - \mu I)^{-1} L_i + [(D_i - \mu I)^{-1} L_i]^2 - \cdots + (-1)^{n_i - 1} [(D_i - \mu I)^{-1} L_i]^{n_i - 1}$.

The last sum extends to at most the $(n_i - 1)$-st power since
$[(D_i - \mu I)^{-1} L_i]^r = 0$, $r \ge n_i$. The validity of this statement can
be verified by noting that $(D_i - \mu I)^{-1} L_i$ has the same (i.e.
lower triangular) form as $L_i$, and that $L_i^r = 0$, $r \ge n_i$.
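
The finite expansion (5.1-3) can be verified exactly on a small case. The sketch below is an illustration only: the 3 x 3 data `d`, `mu` and `L` are hypothetical, and the helper names are assumptions. It forms $S = (D - \mu I)^{-1} L$ with rational arithmetic and checks that the alternating sum up to the power $n - 1$ inverts $I + S$.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def finite_neumann_inverse(S):
    """Return I - S + S^2 - ... + (-1)^(n-1) S^(n-1) for an n x n matrix S."""
    n = len(S)
    total, power, sign = identity(n), identity(n), 1
    for _ in range(1, n):
        power = matmul(power, S)
        sign = -sign
        total = [[total[i][j] + sign * power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


# Hypothetical data: D diagonal, L strictly lower triangular, mu not on diag(D).
d = [Fraction(1), Fraction(2), Fraction(4)]
mu = Fraction(3)
L = [[0, 0, 0], [5, 0, 0], [-1, 2, 0]]
S = [[Fraction(L[i][j]) / (d[i] - mu) for j in range(3)] for i in range(3)]
I_plus_S = [[identity(3)[i][j] + S[i][j] for j in range(3)] for i in range(3)]
check = matmul(I_plus_S, finite_neumann_inverse(S))   # should be the identity
```

The series terminates because S is strictly lower triangular, so its third power vanishes for a 3 x 3 block.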

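Let

    $p_i = \sigma[(D_i - \mu I)^{-1}]$,   $\ell_i = \sigma(L_i)$,   $\sigma(E) = \epsilon$,   $C_\sigma(N) = \kappa$.

Note that

    $p_i = \max_{1 \le j \le n_i} \frac{1}{|\lambda_{ij} - \mu|} = \frac{1}{\min_{1 \le j \le n_i} |\lambda_{ij} - \mu|}$.

From (5.1-2) and (5.1-3)

    $\sigma[(Q_{ii} - \mu I)^{-1}] \le p_i \left[ 1 + p_i \ell_i + \cdots + (p_i \ell_i)^{n_i - 1} \right]$

and

(5.1-4)    $\sigma[(Q_{ii} - \mu I)^{-1}] \le p_i + p_i^2 \ell_i + \cdots + p_i^{n_i} \ell_i^{n_i - 1} = \frac{1}{\ell_i} f^{n_i}(p_i \ell_i)$

(the superscript on f, and later on g, denotes the order of the
defining sum, not a power).

But

(5.1-5)    $\sigma[(Q - \mu I)^{-1}] = \max_{1 \le i \le k} \sigma[(Q_{ii} - \mu I)^{-1}]$.

Thus from (5.1-1), (5.1-4) and (5.1-5)

    $\frac{1}{\kappa \epsilon} \le \max_{1 \le i \le k} \sigma[(Q_{ii} - \mu I)^{-1}] \le \max_{1 \le i \le k} \frac{f^{n_i}(p_i \ell_i)}{\ell_i}$.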

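For some index $i = i_0$, say,

    $\frac{1}{\kappa \epsilon} \le \frac{f^{n_{i_0}}(p_{i_0} \ell_{i_0})}{\ell_{i_0}}$,   i.e.   $\frac{\ell_{i_0}}{\kappa \epsilon} \le f^{n_{i_0}}(p_{i_0} \ell_{i_0})$,

and from the definition of g

    $g^{n_{i_0}}\!\left( \frac{\ell_{i_0}}{\kappa \epsilon} \right) \le p_{i_0} \ell_{i_0}$.

Then

    $\frac{1}{p_{i_0}} \le \max_{1 \le i \le k} \frac{\ell_i}{g^{n_i}(\ell_i / \kappa \epsilon)}$

and

(5.1-6)    $\min_{1 \le j \le n_{i_0}} |\lambda_{i_0 j} - \mu| \le \max_{1 \le i \le k} \frac{\ell_i}{g^{n_i}(\ell_i / \kappa \epsilon)}$.

Thus for any arbitrary eigenvalue $\mu$ of B it is possible
to find some eigenvalue of A such that (5.1-6) holds.

From (5.1-6),

    $\min_{1 \le i \le k,\ 1 \le j \le n_i} |\lambda_{ij} - \mu_r| \le \max_{1 \le i \le k} \frac{\ell_i}{g^{n_i}(\ell_i / \kappa \epsilon)}$

for $r = 1,2,\ldots,n$ and

    $s_A(B) = \max_{1 \le r \le n} \min_{i,j} |\lambda_{ij} - \mu_r| \le \max_{1 \le i \le k} \frac{\ell_i}{g^{n_i}(\ell_i / \kappa \epsilon)}$.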

THEOREM 11. Let A be a non-normal matrix, and let $B - A \ne 0$.
If

    $y_i = \frac{\sigma(L_i)}{C_\sigma(N)\,\sigma(B - A)}$

then

    $s_A(B) \le \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\sigma(N)\,\sigma(B - A)$.

Here $L_i$ and N are as defined earlier in this section.
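
To make the bound of Theorem 11 concrete, the following sketch evaluates its right-hand side from the block data. It is an illustration under stated assumptions: the function names, the representation of each diagonal block as a pair (n_i, norm of L_i), and the numerical inputs are all hypothetical, and g is computed by bisection rather than by any method given in the text.

```python
def f(x, n):
    """f(x) = x + x^2 + ... + x^n."""
    return sum(x ** k for k in range(1, n + 1))


def g(y, n, tol=1e-13):
    """Non-negative solution of g + g^2 + ... + g^n = y, by bisection."""
    lo, hi = 0.0, max(1.0, y)
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid, n) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def quasi_diagonal_bound(blocks, cond, nu_diff):
    """max_i [y_i / g^{n_i}(y_i)] * C(N) * nu(B - A).

    blocks  : list of (n_i, norm of L_i), one pair per diagonal block Q_ii
    cond    : condition number C(N) of the transforming matrix N
    nu_diff : norm of B - A (assumed non-zero)
    """
    best = 0.0
    for n_i, norm_L in blocks:
        y_i = norm_L / (cond * nu_diff)
        # y / g(y) tends to 1 as y -> 0+, see (5.0-4)
        ratio = 1.0 if y_i == 0.0 else y_i / g(y_i, n_i)
        best = max(best, ratio)
    return best * cond * nu_diff


single_block = quasi_diagonal_bound([(2, 1.0)], cond=1.0, nu_diff=0.01)
diagonal_case = quasi_diagonal_bound([(1, 0.0), (1, 0.0)], cond=3.0, nu_diff=0.5)
```

When every $L_i$ vanishes the bound reduces to $C(N)\,\nu(B - A)$, the familiar estimate for matrices diagonalized by N.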

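If $\nu$ is any norm majorizing $\sigma$, we have, since g is
non-negative and monotone increasing,

    $\frac{\sigma(L_i)}{g^{n_i}\!\left( \frac{\sigma(L_i)}{C_\sigma(N)\,\sigma(B - A)} \right)} \le \frac{\sigma(L_i)}{g^{n_i}\!\left( \frac{\sigma(L_i)}{C_\nu(N)\,\nu(B - A)} \right)}$

and consequently

(5.1-7)    $s_A(B) \le \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(B - A)$

where

    $y_i = \frac{\sigma(L_i)}{C_\nu(N)\,\nu(B - A)}$.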


Let now $0 < y_1 < y_2$ and define $x_i = g(y_i)$, $i = 1,2$. From
the monotonicity of $x^{-1} f(x)$ it follows that

    $\frac{y_1}{g(y_1)} \le \frac{y_2}{g(y_2)}$.

Thus the function $y[g(y)]^{-1}$ is monotonically increasing, and
we have from (5.1-7), replacing $\sigma(L_i)$ by $\nu(L_i)$:

COROLLARY. If A is non-normal and $B - A \ne 0$ we have for
any norm $\nu$ majorizing $\sigma$:

(5.1-8)    $s_A(B) \le \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(B - A)$

where

    $y_i = \frac{\nu(L_i)}{C_\nu(N)\,\nu(B - A)}$.

Section 5.2. Comparison of bounds and related results.

Our object here shall be to compare the result given by
Henrici [14],

(5.0-9)    $s_A(B) \le \frac{y}{g^{n}(y)}\,\nu(B - A)$,   $y = \frac{\Delta_\nu(A)}{\nu(B - A)}$,

with that derived earlier (Corollary to Theorem 11),

(5.1-8)    $s_A(B) \le \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(B - A)$,   $y_i = \frac{\nu(L_i)}{C_\nu(N)\,\nu(B - A)}$.


For this purpose, let

    $z_i = \frac{\nu(L_i)}{\Delta_\nu(A)}\,y$

and note that

    $y_i = \frac{z_i}{C_\nu(N)}$.

For those norms $\nu$ such that $\nu(L_i) \le \Delta_\nu(A)$, $z_i \le y$ and we
have by the monotonicity of $y[g(y)]^{-1}$

(5.2-1)    $\frac{z_i}{g^{n}(z_i)} \le \frac{y}{g^{n}(y)}$.

Let $K = C_\nu(N)$. Then for those values of $z_i$ such that

(5.2-2)    $g^{n_i}\!\left( \frac{z_i}{K} \right) \ge g^{n}(z_i)$

we have from (5.2-1)

    $\frac{z_i}{g^{n_i}(z_i / K)} \le \frac{y}{g^{n}(y)}$.

Writing this inequality in terms of $y_i$,

    $\frac{C_\nu(N)\,y_i}{g^{n_i}(y_i)} \le \frac{y}{g^{n}(y)}$.

Thus, for those values of $z_i$ such that $z_i \le y$ and such
that (5.2-2) holds, the estimate (5.1-8) is an improvement over
(5.0-9) given by Henrici.

We shall then be interested in determining conditions on z
in terms of n, $n_i$, K such that

    $g^{n_i}\!\left( \frac{z}{K} \right) \ge g^{n}(z)$.

For this purpose let us introduce the function h(z) defined by

    $h(z) = g^{n}(z) - g^{n_i}\!\left( \frac{z}{K} \right)$

and determine those values of z such that $h(z) \le 0$. We may, of
course, assume that $K > 1$, for $K = 1$ implies $h(z) \le 0$ for
positive values of z.

We begin by investigating the positive zeros of h(z).
Define

    $p(x) = x^{n} + x^{n-1} + \cdots + x^{n_i + 1} + (1 - K) x^{n_i} + (1 - K) x^{n_i - 1} + \cdots + (1 - K) x$.

LEMMA 16. $h(z) = 0$ if and only if $p(x) = 0$, where
$x = g^{n}(z) > 0$ [$z = f^{n}(x)$].

Proof. Suppose $h(z) = 0$ for some $z > 0$. Then $g^{n}(z) = g^{n_i}(z/K)$.
Letting x be this common value we have

    $z = x + x^2 + \cdots + x^{n}$

    $\frac{z}{K} = x + x^2 + \cdots + x^{n_i}$.

Then

    $p(x) = x^{n} + x^{n-1} + \cdots + x - K(x^{n_i} + x^{n_i - 1} + \cdots + x) = 0$.

On the other hand if $p(x) = 0$,

    $x^{n} + x^{n-1} + \cdots + x = K(x^{n_i} + x^{n_i - 1} + \cdots + x)$.

Setting $z = f^{n}(x)$ we have

    $\frac{z}{K} = f^{n_i}(x)$

and consequently

    $g^{n}(z) = g^{n_i}\!\left( \frac{z}{K} \right)$,

proving our assertion.

We have shown that the positive zeros of h(z)
are in a one-to-one correspondence with those of p(x).

By Descartes' rule of signs p(x) can have at most one
positive zero, say $x_0$, since for $K > 1$ its coefficients change sign
only once. By the above lemma, $h(z) = 0$ for at
most one positive value of z, namely $z_0 = f^{n}(x_0)$.

Recalling that

    $h(z) \le 0$ if and only if $g^{n_i}\!\left( \frac{z}{K} \right) \ge g^{n}(z)$

we are led to the following lemmas.

LEMMA 17. If $g^{n_i}(z/K) > g^{n}(z)$ for some z, then $p(x) > 0$
where $x = g^{n_i}(z/K)$ and $p(w) > 0$ where $w = g^{n}(z)$.

Proof. Letting $x = g^{n_i}(z/K)$, $w = g^{n}(z)$ we have

    $x^{n_i} + x^{n_i - 1} + \cdots + x = \frac{z}{K}$

    $w^{n} + w^{n-1} + \cdots + w = z$

and thus

    $w^{n} + w^{n-1} + \cdots + w = K(x^{n_i} + x^{n_i - 1} + \cdots + x)$.

By hypothesis, $x > w$ and

    $f^{n}(x) > f^{n}(w) = K f^{n_i}(x) > K f^{n_i}(w)$.

Therefore

    $p(x) = f^{n}(x) - K f^{n_i}(x) > 0$

    $p(w) = f^{n}(w) - K f^{n_i}(w) > 0$.

LEMMA 18. If $p(x) > 0$ for some x then $g^{n_i}(z/K) > g^{n}(z)$,
where $z = f^{n}(x)$.

Proof. $p(x) > 0$ implies that

    $x^{n} + x^{n-1} + \cdots + x > K(x^{n_i} + x^{n_i - 1} + \cdots + x)$

or

    $x^{n} + x^{n-1} + \cdots + x = K(x^{n_i} + x^{n_i - 1} + \cdots + x) + Kq$

for some $q > 0$.

Let

    $w = g^{n_i}[f^{n_i}(x) + q]$

so that

    $f^{n_i}(w) = f^{n_i}(x) + q$.

Then $w > x$, and

    $x^{n} + x^{n-1} + \cdots + x = K(w^{n_i} + w^{n_i - 1} + \cdots + w)$.

Since $z = f^{n}(x)$, $z/K = f^{n_i}(w)$ and $g^{n_i}(z/K) = w > x = g^{n}(z)$.

Combining these two lemmas we see that

    $h(z) < 0$ if and only if $p(x) > 0$,   $z = f^{n}(x)$ [$x = g^{n}(z)$].

Since $p(x) > 0$ for large x, and since p(x) has at most
one positive root, say $x_0$, $p(x) > 0$ for all $x > x_0$. Consequently
$h(z) < 0$ for all $z > z_0 = f^{n}(x_0)$. If p(x) has no
positive zeros, $p(x) > 0$ for all $x > 0$ and $h(z) < 0$ for all
$z > 0$.
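
The equivalence between the sign of h(z) and the sign of p(x) at $x = g^{n}(z)$ can be spot-checked numerically. The sketch below is illustrative only: the parameter values n = 5, $n_i$ = 2, K = 3 and the helper names are assumptions, and p is written in the compact form $f^{n}(x) - K f^{n_i}(x)$ used in the proof of Lemma 17.

```python
def f(x, n):
    """f^n(x) = x + x^2 + ... + x^n."""
    return sum(x ** k for k in range(1, n + 1))


def g(y, n, tol=1e-13):
    """g^n(y): non-negative solution of f^n(g) = y, by bisection."""
    lo, hi = 0.0, max(1.0, y)
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid, n) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p(x, n, n_i, K):
    """p(x) = f^n(x) - K f^{n_i}(x)."""
    return f(x, n) - K * f(x, n_i)


def h(z, n, n_i, K):
    """h(z) = g^n(z) - g^{n_i}(z / K)."""
    return g(z, n) - g(z / K, n_i)


n, n_i, K = 5, 2, 3.0                      # hypothetical parameters
samples = [0.1, 1.0, 5.0, 10.0, 50.0, 500.0]
signs = [(h(z, n, n_i, K) < 0, p(g(z, n), n, n_i, K) > 0) for z in samples]
```

For these parameters the single positive zero of p lies near x = 1.12, so the sign pattern switches between z = 5 and z = 10.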

The determination of positive zeros of p(x) is, in general,
a difficult problem. We shall instead determine portions of the
positive x-axis for which $p(x) > 0$. It is of course only necessary
to find any point $x_1 > 0$ for which $p(x_1) > 0$. For then $p(x) > 0$,
$x \ge x_1$.

There are three cases to consider.

Case 1. $K n_i = n$.

In this case $p(1) = 0$ and
$p(x) > 0$ for $x > 1$;
$h(z) < 0$ for $z > n$.

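Case 2. $K n_i > n$.

If $x \ne 1$, we may rewrite p(x) as

    $p(x) = \frac{x^{n+1} - x}{x - 1} - K\,\frac{x^{n_i + 1} - x}{x - 1}$.

Then

    $r(x) = (x - 1)\,p(x) = x^{n+1} - K x^{n_i + 1} + (K - 1) x$

has the same sign as p(x) for $x > 1$ and opposite sign for
$x < 1$. But

    $r\!\left( K^{\frac{1}{n - n_i}} \right) = K^{\frac{n+1}{n - n_i}} - K^{1 + \frac{n_i + 1}{n - n_i}} + (K - 1) K^{\frac{1}{n - n_i}} = (K - 1) K^{\frac{1}{n - n_i}} > 0$.

Thus

    $p\!\left( K^{\frac{1}{n - n_i}} \right) > 0$

since $K^{\frac{1}{n - n_i}} > 1$. Hence

    $p(x) > 0$ at least for $x \ge K^{\frac{1}{n - n_i}}$

and $h(z) < 0$ at least for

    $z \ge f^{n}\!\left( K^{\frac{1}{n - n_i}} \right)$.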
Case 3. $K n_i < n$.

    $p(1) = n - K n_i > 0$

    $p(x) > 0$ for $x \ge 1$

    $h(z) < 0$ for $z \ge f^{n}(1) = n$.
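
The three cases translate into a simple sufficient threshold on z. The following sketch is an illustration, not the author's procedure: the function name `sufficient_threshold` and the sample parameters are hypothetical, and it assumes $n_i < n$ and $K \ge 1$. It returns n in Cases 1 and 3 and $f^{n}(K^{1/(n - n_i)})$ in Case 2, and the sample points allow a check that h is non-positive beyond the threshold.

```python
def f(x, n):
    """f^n(x) = x + x^2 + ... + x^n."""
    return sum(x ** k for k in range(1, n + 1))


def g(y, n, tol=1e-13):
    """g^n(y): non-negative solution of f^n(g) = y, by bisection."""
    lo, hi = 0.0, max(1.0, y)
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid, n) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def h(z, n, n_i, K):
    """h(z) = g^n(z) - g^{n_i}(z / K)."""
    return g(z, n) - g(z / K, n_i)


def sufficient_threshold(n, n_i, K):
    """A value z* with h(z) <= 0 for all z >= z*, following Cases 1 to 3."""
    if not (K >= 1 and 1 <= n_i < n):
        raise ValueError("sketch assumes K >= 1 and 1 <= n_i < n")
    if K * n_i <= n:                       # Cases 1 and 3
        return float(n)
    return f(K ** (1.0 / (n - n_i)), n)    # Case 2


cases = [(4, 2, 2.0), (5, 2, 3.0), (5, 2, 2.0)]   # K*n_i = n, > n, < n
values = [[h(sufficient_threshold(n, n_i, K) * s, n, n_i, K)
           for s in (1.0, 1.5, 10.0)] for n, n_i, K in cases]
```

The threshold is only sufficient: in Case 2 the true zero $z_0$ of h may be considerably smaller than the value returned here.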


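THEOREM 12. For all norms $\nu$ such that $\nu(L_i) \le \Delta_\nu(A)$,
$i = 1,2,\ldots,k$,

    $g^{n_i}\!\left( \frac{z_i}{K} \right) \ge g^{n}(z_i)$

for

    1.  $z_i \ge n$   if $K n_i = n$

    2.  $z_i \ge f^{n}\!\left( K^{\frac{1}{n - n_i}} \right)$   if $K n_i > n$

    3.  $z_i \ge n$   if $K n_i < n$.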


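Since $y = \frac{\Delta_\nu(A)}{\nu(L_i)}\,z_i$ we have

COROLLARY. For all norms $\nu$ such that $\nu(L_i) \le \Delta_\nu(A)$,
$i = 1,2,\ldots,k$,

    $g^{n_i}\!\left( \frac{z_i}{K} \right) \ge g^{n}(z_i)$

for

    1.  $y \ge n\,\frac{\Delta_\nu(A)}{\nu(L_i)}$   if $K n_i = n$

    2.  $y \ge f^{n}\!\left( K^{\frac{1}{n - n_i}} \right) \frac{\Delta_\nu(A)}{\nu(L_i)}$   if $K n_i > n$

    3.  $y \ge n\,\frac{\Delta_\nu(A)}{\nu(L_i)}$   if $K n_i < n$.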


Therefore the estimate (5.1-8) represents an improvement over
(5.0-9) if for each i

    1.  $y \ge n \max_{1 \le i \le k} \frac{\Delta_\nu(A)}{\nu(L_i)}$   for $K n_i = n$

    2.  $y \ge f^{n}\!\left( K^{\frac{1}{n - n_i}} \right) \max_{1 \le i \le k} \frac{\Delta_\nu(A)}{\nu(L_i)}$   for $K n_i > n$

    3.  $y \ge n \max_{1 \le i \le k} \frac{\Delta_\nu(A)}{\nu(L_i)}$   for $K n_i < n$.

The intersection of the regions determined for each i above yields
an interval of the y axis for which our estimate is preferable.

The above analysis is valid for every ordered Schur form A.
Therefore we have:

THEOREM 13. Under the hypotheses of Theorem 11,

    $s_A(B) \le \min \left\{ \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(B - A) \right\}$

where

    $y_i = \frac{\nu(L_i)}{C_\nu(N)\,\nu(B - A)}$

and the minimum is taken with respect to all ordered Schur forms.

From the fact that $B - A = U(U^* B U - M)U^*$ and the unitary
invariance of $\sigma$, $\sigma(B - A) = \sigma(U^* B U - M)$ and we may rewrite the
above theorem as follows.

COROLLARY. For non-normal M with $M - B \ne 0$ we have for
any norm $\nu$ dominating $\sigma$

(5.2-3)    $s_M(B) \le \min_U \left\{ \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(U^* B U - M) \right\}$

where

    $y_i = \frac{\nu(L_i)}{C_\nu(N)\,\nu(U^* B U - M)}$

and the minimum is taken with respect to all U occurring in an
ordered Schur form of M.

Related  results  on 

For given matrices M, B satisfying the hypotheses of
Theorem 11 let $\delta$ represent the quantity on the right hand side
of (5.2-3). The statement of the above corollary may then be
interpreted geometrically by saying that the spectrum of B is
contained in the union $\Gamma_\delta$ of the disks

    $D_i = \{ \lambda : |\lambda - \lambda_i| \le \delta \}$,   $i = 1, 2, \ldots, n$.

Since $\delta \to 0$ monotonically as $U^* B U \to M$, or alternately,
as $B \to A$, we may conclude by a well known continuity argument
(see e.g. [22]) that each component of $\Gamma_\delta$ contains as many
eigenvalues of B as of M. From this fact we can obtain, again using
a well-known argument (see especially the translator's note in [23]),
the following result:

THEOREM 14. For non-normal M with $M - B \ne 0$ we have for
any norm $\nu$ dominating $\sigma$

    $v(M,B) \le (2k - 1) \min_U \left\{ \max_{1 \le i \le k} \left[ \frac{y_i}{g^{n_i}(y_i)} \right] C_\nu(N)\,\nu(U^* B U - M) \right\}$

where

    $y_i = \frac{\nu(L_i)}{C_\nu(N)\,\nu(U^* B U - M)}$

and where the minimum is taken with respect to all U occurring in
an ordered Schur form of M.
APPENDIX


The Matrix Equation -AX + XB = C and Related Results

The problem of finding a solution to the matrix equation
$-AX + XB = C$, where A, B are square matrices of arbitrary order,
will be denoted as problem (A). It is of importance in the development
of a canonical form, which in turn is applied to the solution of
problems (ii), (iii) and (iv) as set forth in the Introduction.

Let $\mathfrak{B}$ be a Banach algebra, with elements A, B, C, .... T will
be an operator on $\mathfrak{B}$ such that

    $T(X) = -AX + XB$ for every $X \in \mathfrak{B}$.

The following results are to be found in the literature.

Result 1. [Rutherford, [26]]

Let $\mathfrak{B}$ be the algebra of n x n matrices. If the characteristic
roots of A are distinct from the characteristic roots of B
then $T^{-1}$ exists and is bounded.

The proof, though constructive, depends upon a complete knowledge
of the Jordan canonical form.

Result 2. [Heinz, [13]]

Let $\mathfrak{B}$ be the space of bounded linear operators on a Hilbert
space $\mathcal{H}$, with inner product $(\cdot,\cdot)$. If there exist real numbers
a and b such that $a > b$, $B + B^* \le b$, $A + A^* \ge a$,
then $T^{-1}$ exists as a bounded linear operator and has the
representation


where by $A \le a$ we mean $(u, Au) \le a$ for all $u \in \mathcal{H}$.

Notable extensions of Result 2 are given by Rosenblum [25] and
Cordes [5], [6].

Result 3.

Givens [11] gives a formal solution of $AX + XA^T = Y$ in terms
of adjoints, which, although not done in his paper, is immediately
extendable to the problem $AX + XB = Y$. The proof and subsequent
simplicity of the representation depend upon the assumption of
simplicity of the roots of A (or B), a severe restriction for our
purposes. These results generalize those of Hahn [12] for the case
$Y = I$.

We shall see that an integral representation of the solution
of $-AX + XB = C$ similar to that given by Result 2 is valid under
assumptions similar to those in Result 1, but without the restriction
that A, B and C be square matrices of order n.

THEOREM A-1. A necessary and sufficient condition that problem
(A) have a solution for all C is that $-\lambda_i + \mu_j \ne 0$ where $\lambda_i$
are the eigenvalues of A and $\mu_j$ the eigenvalues of B; i.e.,
if and only if the eigenvalues of A differ from those of B. If
a solution exists it is unique.

Proof. Let $A \otimes B$ denote the Kronecker product of arbitrary
matrices A and B. That is, $A \otimes B$ is a matrix whose general
element is $a_{ij} B$. The eigenvalues of $A \otimes B$ are all the possible
products $\lambda_i \mu_j$ where $\lambda_i$ is any eigenvalue of A and $\mu_j$ is
any eigenvalue of B [Bellman, [4]].

If we consider $-AX + XB = C$ as a system of linear equations
in the unknowns $x_{ij}$, the coefficient matrix is $-A \otimes I + I \otimes B$,
the roots of which are $-\lambda_i + \mu_j$. By assumption $-\lambda_i + \mu_j \ne 0$.
Therefore a (unique) solution exists.
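
The proof above suggests a direct, if inefficient, way to solve problem (A): write the equation as a linear system in the entries of X and eliminate. The sketch below is an illustration under stated assumptions, not the constructive method of this dissertation. The function names and the sample matrices are hypothetical, exact rational arithmetic is used, and the unknowns are stacked row by row, in which case B enters the coefficient matrix through its transpose; the eigenvalues $-\lambda_i + \mu_j$ are unaffected by that choice.

```python
from fractions import Fraction


def solve_linear(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination with row pivoting."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system: A and B share an eigenvalue")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def solve_problem_A(A, B, C):
    """Solve -A X + X B = C with A (n x n), B (m x m), C (n x m)."""
    n, m = len(A), len(B)
    M = [[Fraction(0)] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(m):
            row = i * m + j                  # equation for entry (i, j)
            for k in range(n):
                M[row][k * m + j] -= A[i][k]  # contribution of -(A X)_{ij}
            for l in range(m):
                M[row][i * m + l] += B[l][j]  # contribution of +(X B)_{ij}
    x = solve_linear(M, [C[i][j] for i in range(n) for j in range(m)])
    return [[x[i * m + j] for j in range(m)] for i in range(n)]


def apply_T(A, B, X):
    """Return T(X) = -A X + X B."""
    n, m = len(A), len(B)
    return [[-sum(A[i][k] * X[k][j] for k in range(n))
             + sum(X[i][l] * B[l][j] for l in range(m))
             for j in range(m)] for i in range(n)]


A = [[1, 2], [0, 3]]                          # eigenvalues 1, 3
B = [[-1, 0, 0], [1, -2, 0], [0, 1, -4]]      # eigenvalues -1, -2, -4
C = [[1, 0, 2], [0, 1, -1]]
X = solve_problem_A(A, B, C)
```

The system has nm unknowns, so this approach is only practical for small orders; it serves here to illustrate the solvability criterion of Theorem A-1.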

Result 4. [Bellman [4], p. 115]

If the expression

    $-\int_0^\infty e^{-At}\,C\,e^{Bt}\,dt$

exists for all C, it represents the unique solution of

    $-AX + XB = C$.

The existence of the integral implies that $\lim_{t \to \infty} z(t) = 0$ where

    $z(t) = e^{-At}\,C\,e^{Bt}$.

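We shall examine the form of z(t) in detail. For any square
scalar matrix G, let J denote its Jordan canonical form. Thus
there exists a nonsingular constant matrix T such that

    $G = T J T^{-1}$.

J has the form

    $J = \mathrm{diag}(\Lambda, J_1, J_2, \ldots)$

where $\Lambda$ is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_q$
and

    $J_i = \lambda_{q+i} I_{r_i} + Z_i$

where

    $Z_i = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$

is of order $r_i$.

It follows that

    $e^{tG} = T e^{tJ} T^{-1}$

and

    $e^{tJ} = \mathrm{diag}(e^{t\Lambda}, e^{tJ_1}, e^{tJ_2}, \ldots)$.

Since $J_i = \lambda_{q+i} I_{r_i} + Z_i$ and from the fact that $\lambda_{q+i} I_{r_i}$
commutes with $Z_i$ we have

    $e^{tJ_i} = e^{\lambda_{q+i} t}\,e^{t Z_i}$.

Thus

    $e^{tJ_i} = e^{\lambda_{q+i} t} \begin{pmatrix} 1 & t & \cdots & \frac{t^{r_i - 1}}{(r_i - 1)!} \\ 0 & 1 & \cdots & \frac{t^{r_i - 2}}{(r_i - 2)!} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$

where $e^{tJ_i}$ is an $r_i \times r_i$ matrix.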

Hence if $J_A$, $J_B$ are, respectively, Jordan canonical forms
for A and B and T, V are nonsingular matrices such that

    $A = T J_A T^{-1}$,   $B = V J_B V^{-1}$

it follows that

    $e^{-At} = T e^{-t J_A} T^{-1}$,   $e^{Bt} = V e^{t J_B} V^{-1}$

and

    $z(t) = T e^{-t J_A} T^{-1}\,C\,V e^{t J_B} V^{-1}$.

Thus every element of z(t) is a linear combination of terms of
the form

    $e^{(-\lambda_i + \mu_j) t}\,p_k(t)$

where $p_k(t)$ is a polynomial of degree not exceeding $n + m - 2$ if
A is of order n and B is of order m.

It is clear then that if $\mathrm{Re}\,\lambda_i > \mathrm{Re}\,\mu_j$, $\lim_{t \to \infty} z(t) = 0$.
Moreover it is clear in this case that the integral exists for all C
since each element of z(t) is integrable. We have then the
following lemma.

LEMMA A. A sufficient condition that

    $X = -\int_0^\infty e^{-At}\,C\,e^{Bt}\,dt$

be a solution of the matrix equation

    $-AX + XB = C$

is that $\mathrm{Re}\,\lambda_i > \mathrm{Re}\,\mu_j$ for all $\lambda_i$ and $\mu_j$, eigenvalues of A
and B respectively. The solution, if it exists, is unique.

We shall now show that it is possible to give an integral
formula for X similar to Result 2 which will be valid if
$\mathrm{Re}\,\lambda_i \succ \mathrm{Re}\,\mu_j$ for all $\lambda_i$, $\mu_j$.

$\mathrm{Re}\,\lambda_i \succ \mathrm{Re}\,\mu_j$ implies that either

(1)  $\mathrm{Re}\,\lambda_i > \mathrm{Re}\,\mu_j$

or

(2)  $\mathrm{Re}\,\lambda_i = \mathrm{Re}\,\mu_j$ and $\mathrm{Im}\,\lambda_i > \mathrm{Im}\,\mu_j$.

We have already disposed of (1) in the preceding lemma and
must now consider the case where (2) holds for some or all of the
roots of A and B.

Instead, however, of trying to find a direct solution of

(A-1)    $-AX + XB = C$,

we shall solve the system

(A-2)    $-e^{i\theta} A X + X e^{i\theta} B = C e^{i\theta}$

where $\theta$ is a real number to be determined.

Clearly any solution of (A-2) will be a solution of (A-1) and
the unique solution will be given by

(A-3)    $X = -e^{i\theta} \int_0^\infty \exp[-e^{i\theta} A t]\,C\,\exp[e^{i\theta} B t]\,dt$

provided the integral exists.

The idea is to demonstrate values of $\theta$ such that
$\mathrm{Re}\,e^{i\theta}\lambda_i > \mathrm{Re}\,e^{i\theta}\mu_j$ for all $\lambda_i$, $\mu_j$ and to apply the preceding
lemma, yielding the representation given by (A-3).

For this let an arbitrary root of A be given by $\lambda_i = x_i + i y_i$
and that of B by $\mu_j = u_j + i v_j$.

The roots of $-e^{i\theta} A$ are $(-x_i \cos\theta + y_i \sin\theta) + i(-y_i \cos\theta - x_i \sin\theta)$
and those of $e^{i\theta} B$ are $(u_j \cos\theta - v_j \sin\theta) + i(v_j \cos\theta + u_j \sin\theta)$.

Letting

    $h_{ij}(\theta) = [(u_j - x_i) \cos\theta + (y_i - v_j) \sin\theta] + i[(v_j - y_i) \cos\theta + (u_j - x_i) \sin\theta]$

we have that every term of

    $z_\theta(t) = e^{i\theta} \exp[-e^{i\theta} A t]\,C\,\exp[e^{i\theta} B t]$

consists of linear combinations of terms of the form $e^{h_{ij}(\theta) t}\,p_k(t)$,
where $p_k(t)$ are polynomials in t of finite degree.

A sufficient condition that $\lim_{t \to \infty} z_\theta(t) = 0$, and indeed that
the integral (A-3) exist, is that

(A-4)    $\mathrm{Re}\,h_{ij}(\theta) < 0$ for all i, j.

We shall show that it is always possible to choose a $\theta$ such
that (A-4) holds. Indeed $\theta$ can be chosen from a non-degenerate
interval.

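Let

(A-5)    $\alpha_{ij} = -x_i + u_j$,

(A-6)    $\beta_{ij} = y_i - v_j$.

Then

    $\mathrm{Re}\,h_{ij}(\theta) = \alpha_{ij} \cos\theta + \beta_{ij} \sin\theta$.

We have, by the lexicographic ordering:

    $\alpha_{ij} \le 0$, and $\alpha_{ij} = 0$ implies $\beta_{ij} > 0$.

If all $\alpha_{ij} < 0$ it is sufficient to choose $\theta = 0$.

If $\alpha_{ij} = 0$ then $\mathrm{Re}\,h_{ij}(\theta) < 0$ iff $\sin\theta < 0$, i.e.
$\pi < \theta < 2\pi$.

If some $\alpha_{ij} < 0$, for $\mathrm{Re}\,h_{ij}(\theta) < 0$ the following
relationships must be true:

    $\alpha_{ij} \cos\theta + \beta_{ij} \sin\theta < 0$

    $\beta_{ij} \sin\theta < -\alpha_{ij} \cos\theta$.

If, in addition, $\pi < \theta < 2\pi$,

    $\beta_{ij} > -\alpha_{ij} \cot\theta$

and

(A-7)    $-\frac{\beta_{ij}}{\alpha_{ij}} > \cot\theta$.

Let

    $\gamma_{ij} = +\infty$ if $\alpha_{ij} = 0$,   $\gamma_{ij} = -\frac{\beta_{ij}}{\alpha_{ij}}$ if $\alpha_{ij} < 0$,

(A-8)    $\gamma = \min_{i,j} \gamma_{ij}$.

$\cot\theta < \gamma$ implies that $\mathrm{Re}\,h_{ij}(\theta) < 0$. $\gamma \ne -\infty$ since this would
mean that for some i, j, $\alpha_{ij} = 0$, $\beta_{ij} < 0$, which contradicts the
assumed lexicographic ordering. Since $\cot\theta$ assumes all values
between $-\infty$ and $+\infty$ in the interval $\pi < \theta < 2\pi$, it is
sufficient to choose $\theta$ such that

(A-9)    $\pi < \operatorname{arccot} \gamma < \theta < 2\pi$.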

LEMMA B. If $\gamma$ is defined by (A-8) and if $\theta$ is such that
(A-9) holds then

    $X = -e^{i\theta} \int_0^\infty \exp[-e^{i\theta} A t]\,C\,\exp[e^{i\theta} B t]\,dt$

is the unique solution of

    $-AX + XB = C$.
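
The choice of the rotation angle can be sketched in a few lines. The code below is illustrative only: the function name `choose_theta`, the midpoint rule for picking a point of the admissible interval, and the sample eigenvalues are assumptions. It computes $\gamma$ from the quantities $\alpha_{ij} = -x_i + u_j$ and $\beta_{ij} = y_i - v_j$, takes the branch of arccot lying in $(\pi, 2\pi)$, and returns an angle for which every $\mathrm{Re}\,e^{i\theta}(\mu_j - \lambda_i)$ is negative.

```python
import cmath
import math


def choose_theta(lams, mus):
    """Return theta in (pi, 2 pi) with Re(e^{i theta} (mu_j - lam_i)) < 0 for all i, j.

    lams, mus: eigenvalues of A and B, assumed to satisfy the lexicographic
    ordering hypothesis (Re lam > Re mu, or equal real parts and Im lam > Im mu).
    """
    gamma = math.inf
    for lam in map(complex, lams):
        for mu in map(complex, mus):
            alpha = mu.real - lam.real      # alpha_ij = -x_i + u_j
            beta = lam.imag - mu.imag       # beta_ij  =  y_i - v_j
            if alpha > 0 or (alpha == 0 and beta <= 0):
                raise ValueError("lexicographic ordering hypothesis violated")
            if alpha < 0:
                gamma = min(gamma, -beta / alpha)
    # cot decreases from +inf to -inf on (pi, 2 pi), so cot(theta) < gamma
    # holds exactly for theta above the arccot branch in that interval.
    lower = math.pi if gamma == math.inf else math.pi + math.atan2(1.0, gamma)
    return 0.5 * (lower + 2.0 * math.pi)


lams = [1 + 2j, 3]          # hypothetical eigenvalues of A
mus = [1 + 1j, 0]           # hypothetical eigenvalues of B
theta = choose_theta(lams, mus)
real_parts = [(cmath.exp(1j * theta) * (complex(mu) - complex(lam))).real
              for lam in lams for mu in mus]
```

Any angle strictly between the lower limit and $2\pi$ would serve; the midpoint is used here only to stay away from both endpoints.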
REMARK. If all $\alpha_{ij} < 0$, we may take $\theta = 0$ and

    $X = -\int_0^\infty e^{-At}\,C\,e^{Bt}\,dt$

is the unique solution of

    $-AX + XB = C$.


